Abstract

We consider two continuous Itô semimartingales observed with noise and sampled at stopping times
in a nonsynchronous manner. In this article we establish a central limit theorem for the pre-averaged
Hayashi-Yoshida estimator of their integrated covariance in a general endogenous time setting. In particular,
we show that the time endogeneity has no impact on the asymptotic distribution of the pre-averaged
Hayashi-Yoshida estimator, which contrasts with the case of the realized volatility in a pure diffusion setting.
We also establish a central limit theorem for the modulated realized covariance, which is another
pre-averaging based integrated covariance estimator, and demonstrate that the above property seems to be a
special feature of the pre-averaging technique.



Keywords: Central limit theorem; Hitting times; Market microstructure noise; Nonsynchronous observations
1 Introduction



Let $X = (X_t)_{t\in\mathbb{R}_+}$ be a continuous semimartingale on a stochastic basis $(\Omega, \mathcal{F}, (\mathcal{F}_t), P)$. Suppose that for
each $n$ we have a sequence $(t^n_i)_{i\in\mathbb{Z}_+}$ of $(\mathcal{F}_t)$-stopping times such that $t^n_0 = 0$ and $t^n_i \uparrow \infty$ as $i \to \infty$. Then, as is
well known, for any $t \in \mathbb{R}_+$ the quantity $RV^n_t := \sum_{i: t^n_i \le t}(X_{t^n_i} - X_{t^n_{i-1}})^2$ converges to the quadratic variation
$[X]_t$ of $X$ in probability as $n \to \infty$, provided that $A_n(t) := \sup_i(t^n_i \wedge t - t^n_{i-1} \wedge t) \to^p 0$ (see Theorem I-4.47
in [29] for example). Here, the notation $\to^p$ means convergence in probability. In recent years this classic
result has been highlighted in the context of high-frequency data analysis. In the econometric literature,
the quantities $RV^n_t$ and $[X]_t$ are called the realized volatility (RV) and integrated volatility (IV) (up to the
time $t$) respectively. Thus, the above result is equivalent to saying that the RV is a consistent estimator for
the IV if $A_n(t) \to^p 0$ as $n \to \infty$. Because the importance of the IV as an index of the volatility of assets
has been recognized since a series of studies by Andersen and Bollerslev [1,2], and because the increasing availability
of high-frequency data in finance makes the assumption $A_n(t) \to^p 0$ reliable, the statistical theory for the
estimation of the IV has been developed by many authors recently.
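The consistency statement above can be illustrated numerically. The following is a minimal sketch, assuming a Brownian path with constant volatility observed on a regular grid; the function names `realized_volatility` and `simulate_brownian_path` are illustrative and do not come from the article.

```python
import math
import random


def realized_volatility(path):
    """Sum of squared increments of the observed path (the RV)."""
    return sum((b - a) ** 2 for a, b in zip(path, path[1:]))


def simulate_brownian_path(n, sigma=1.0, seed=0):
    """Values of sigma * W at the grid points i/n, i = 0, ..., n (illustrative model)."""
    rng = random.Random(seed)
    path = [0.0]
    for _ in range(n):
        path.append(path[-1] + sigma * math.sqrt(1.0 / n) * rng.gauss(0.0, 1.0))
    return path


if __name__ == "__main__":
    # For this model the quadratic variation at time 1 is sigma^2 = 4.
    for n in (100, 10_000):
        print(n, realized_volatility(simulate_brownian_path(n, sigma=2.0)))
```

As the grid is refined, the printed values should settle near the quadratic variation of the simulated path, which is the content of the convergence in probability.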

One of the interesting topics after the consistency is the asymptotic distribution theory. In the context 
of the statistical estimation of diffusion parameters, such a theory has already appeared in Dohnal [12] and 
Genon-Catalot and Jacod [19, 20]. See also the recent works of Uchida and Yoshida [48] and Ogihara and 
Yoshida [40]. Also, the early limit theory for the RV was developed in Jacod [26], Jacod and Protter [28] and 
Zhang [51] in different contexts. In the present situation, under some regularity conditions Barndorff-Nielsen
and Shephard [5] developed a "feasible" central limit theorem

$$\frac{RV^n_t - [X]_t}{\sqrt{\frac{2}{3}RQ^n_t}} \to^d N(0,1) \quad \text{as } n \to \infty \tag{1.1}$$

in the regular sampling case $t^n_i = i/n$. Here, the notation $\to^d$ means convergence in distribution and the
quantity $RQ^n_t$ defined by $RQ^n_t = \sum_{i: t^n_i \le t}(X_{t^n_i} - X_{t^n_{i-1}})^4$ is sometimes called the realized quarticity. In this
case, even the second-order asymptotic expansion of the statistic on the left-hand side of (1.1) was developed
in Yoshida [50].
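A feasible central limit theorem of the type (1.1) is what makes confidence intervals for the IV computable from data alone. The sketch below assumes the usual studentization of the RV error by $\sqrt{(2/3)RQ}$ with the unnormalized realized quarticity defined in the text; the function names and the 1.96 quantile are illustrative choices, not taken from the article.

```python
import math


def realized_quarticity(path):
    """Sum of fourth powers of the increments (the RQ, without any rescaling)."""
    return sum((b - a) ** 4 for a, b in zip(path, path[1:]))


def studentized_rv_error(path, integrated_volatility):
    """(RV - IV) / sqrt((2/3) * RQ); approximately N(0, 1) under regular sampling."""
    rv = sum((b - a) ** 2 for a, b in zip(path, path[1:]))
    return (rv - integrated_volatility) / math.sqrt(2.0 * realized_quarticity(path) / 3.0)


def iv_confidence_interval(path, quantile=1.96):
    """Approximate confidence interval RV -/+ quantile * sqrt((2/3) * RQ) for the IV."""
    rv = sum((b - a) ** 2 for a, b in zip(path, path[1:]))
    half_width = quantile * math.sqrt(2.0 * realized_quarticity(path) / 3.0)
    return rv - half_width, rv + half_width
```

The interval is only justified under the conditions for which (1.1) holds; the discussion that follows in the text explains why it can fail for endogenous sampling times.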
It is natural to ask what happens when we consider more general stopping times as the sampling times $(t^n_i)$.
In fact, Barndorff-Nielsen and Shephard [6] and Mykland and Zhang [38] showed that the convergence (1.1) is
also valid with more general deterministic sampling times. Moreover, even when the sampling times
are random and endogenous (i.e., path-dependent), (1.1) is still valid as long as $(t^n_i)$ satisfies a kind of
strong predictability condition, as shown in Hayashi et al. [23], Hayashi and Yoshida [25] and Phillips and Yu
[42]. Here, the strong predictability condition intuitively means that the future sampling times are determined
with delay. See [23] and [25] for a more precise definition. Such a condition has already appeared in [20]
and [26]. Dropping the strong predictability condition is much more difficult. For some special hitting-time-based
sampling schemes, (1.1) was verified by Fukasawa [16] and Fukasawa and Rosenbaum [18]. However, the
convergence (1.1) could fail for general endogenous sampling times. In fact, Fukasawa [15, 17] showed that
the asymptotic distribution of the RV is determined by the asymptotic skewness and kurtosis of observed
returns. More precisely, suppose that X is a continuous local martingale with £J[(X)^] < oo for simplicity. 
Then, set Gj n = E[(Xf!- +1 — X t ^) k \Ftv\ for every j, n and each k = 2, . . . , 12 and suppose also that there exist 
(Jt)-adapted locally bounded left continuous processes u and v such that Qj n /Q 2 n = v^nT 1 ! 2 + o p (n -1 ' 2 ), 
Qj,JQl„ = urjn- 1 + Opin" 1 ) and QfJQ}^ = o p {n~ k l 2 ) (k = 3,4,6) uniformly in j with ^<iasn^ oo. 
Suppose further that J2jt <t&j n = O p (l) as n 4 oo. Then, the asymptotic distribution of the (scaled) 
estimation error $\sqrt{n}(RV^n_t - [X]_t)$ of the RV is given by



where $W$ is a standard Wiener process independent of $\mathcal{F}$. See Theorem 3.10 of [17] for details. This type of
result was also obtained by Li et al. [34]. We call the first integral in (1.2) the limiting bias, following [18].

Though the limit theory reviewed above provides us with a beautiful framework for estimating the IV from
high-frequency financial data, we encounter another problem called market microstructure noise when we
focus on ultra-high frequencies. For this reason, many authors have recently proposed alternative estimators
for the IV that take microstructure noise into account, e.g., the two-time scale realized volatility of [53], the realized
kernel of [3], the pre-averaging estimator of [43, 27] and the realized quasi-maximum likelihood estimator of [49]. The
aim of this article is to answer the natural question of what happens to the asymptotic distribution of such
estimators when sampling times are random and endogenous. This type of problem has been well
studied in recent years when sampling times are deterministic, or random but independent of observations,
in connection with the problem of nonsynchronous observations, which is another important problem in
analyzing high-frequency data of multiple assets. See [4, 8, 11, 46] for example. The case where both
microstructure noise and time endogeneity are present was considered in Li et al. [35], who
constructed a new estimator and developed an asymptotic distribution theory for it.

In this article we will focus on the pre-averaging estimators, especially the pre-averaged Hayashi-Yoshida
estimator (PHY) proposed in Christensen et al. [10], which is a pre-averaging version of the Hayashi-Yoshida
estimator proposed in Hayashi and Yoshida [24]. For the case with deterministic sampling times, the asymptotic
distribution of this estimator was derived in [11]. The case where a kind of strong predictability condition
holds true was also treated in [32]. In both cases no limiting bias appears, which is of course
naturally predicted from the RV case. Interestingly, in this article we will show that nothing changes even
if we drop the strong predictability condition above (more precisely, we can replace the strong predictability
condition in [32] by a kind of continuity for the conditionally expected durations). That is, the
asymptotic distribution of the PHY does not change even in the presence of the time endogeneity; in particular,
no limiting bias appears. This is quite different from the RV. Furthermore, we will show
our result in the bivariate setting with nonsynchronous observations because it causes no additional difficulty compared
with the univariate setting. This is completely different from the no-noise case and reflects the fact that the
nonsynchronicity of observation times is less important in the presence of noise, as shown in [9].

Compared with the estimator proposed in [35], the PHY has two advantages besides being applicable to
nonsynchronous data. First, it attains the optimal convergence rate. Second, we do not need to correct the
limiting bias of the estimator, so that it is easier to implement.

The plan of this article is as follows. Section 2 presents the mathematical model and the construction of 
the pre-averaged Hayashi-Yoshida estimator. Section 3 is devoted to the main result of this article. Section
4 provides some concrete examples of sampling times which are possibly endogenous. Section 5 discusses 
Studentization, autocorrelated noise and a comparison between some existing approaches, and Section 6 uses 
Monte Carlo simulations to verify the conclusions obtained from the previous sections. Most of the proofs 
are given in the Appendix. 
Notation

We denote by $D(\mathbb{R}_+)$ the space of càdlàg functions on $\mathbb{R}_+$ equipped with the Skorokhod topology. A
sequence of random elements $X^n$ defined on a probability space $(\Omega, \mathcal{F}, P)$ is said to converge stably in law to
a random element $X$ defined on an appropriate extension $(\tilde\Omega, \tilde{\mathcal{F}}, \tilde P)$ of $(\Omega, \mathcal{F}, P)$ if $E[Yg(X^n)] \to \tilde E[Yg(X)]$
for any $\mathcal{F}$-measurable and bounded random variable $Y$ and any bounded and continuous function $g$. We
then write $X^n \to^{d_s} X$. A sequence $(X^n)$ of stochastic processes is said to converge to a process $X$ uniformly
on compacts in probability (abbreviated ucp) if, for each $t > 0$, $\sup_{0\le s\le t}|X^n_s - X_s| \to^p 0$ as $n \to \infty$.

If a process $V$ is (pathwise) absolutely continuous, we denote its density process by $V'$. $|\cdot|$ denotes the
Lebesgue measure. For a (random) interval $I$ and a time $t \in \mathbb{R}_+$, we write $I(t) = I \cap [0,t)$.

2 The setting 

2.1 Model 

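Let $\mathcal{B}^{(0)} = (\Omega^{(0)}, \mathcal{F}^{(0)}, \mathbf{F}^{(0)} = (\mathcal{F}^{(0)}_t)_{t\in\mathbb{R}_+}, P^{(0)})$ be a stochastic basis. For any $t \in \mathbb{R}_+$ we have a transition
probability $Q_t(\omega^{(0)}, dz)$ from $(\Omega^{(0)}, \mathcal{F}^{(0)}_t)$ into $\mathbb{R}^2$, which satisfies $\int z\,Q_t(\omega^{(0)}, dz) = 0$. We endow the space
$\Omega^{(1)} = (\mathbb{R}^2)^{[0,\infty)}$ with the product Borel $\sigma$-field $\mathcal{F}^{(1)}$ and with the probability $Q(\omega^{(0)}, d\omega^{(1)})$ which is the
product $\otimes_{t\in\mathbb{R}_+} Q_t(\omega^{(0)}, \cdot)$. We also call $(\epsilon_t)_{t\in\mathbb{R}_+}$ the "canonical process" on $(\Omega^{(1)}, \mathcal{F}^{(1)})$ and define the filtration
$\mathcal{F}^{(1)}_t = \sigma(\epsilon_s; s < t)$. Then we consider the stochastic basis $\mathcal{B} = (\Omega, \mathcal{F}, \mathbf{F} = (\mathcal{F}_t)_{t\in\mathbb{R}_+}, P)$ defined as follows:

$$P(d\omega^{(0)}, d\omega^{(1)}) = P^{(0)}(d\omega^{(0)})\,Q(\omega^{(0)}, d\omega^{(1)}).$$

Any variable or process which is defined on either $\Omega^{(0)}$ or $\Omega^{(1)}$ can be considered in the usual way as a variable
or a process on $\Omega$.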

Now we introduce our observation data. Let $X$ and $Y$ be two continuous semimartingales on $\mathcal{B}^{(0)}$. Also,
we have two sequences of $\mathbf{F}^{(0)}$-stopping times $(S^i)_{i\in\mathbb{Z}_+}$ and $(T^j)_{j\in\mathbb{Z}_+}$ that are increasing a.s.,

$$S^i \uparrow \infty \quad\text{and}\quad T^j \uparrow \infty. \tag{2.1}$$

As a matter of convenience we set $S^{-1} = T^{-1} = 0$. These stopping times implicitly depend on a parameter
$n \in \mathbb{N}$, which represents the frequency of the observations. Denote by $(b_n)$ a sequence of positive numbers
tending to 0 as $n \to \infty$ (typically $b_n = n^{-1}$). Let $\xi'$ be a constant satisfying $0 < \xi' < 1$. In this paper, we
will always assume that

$$r_n(t) := \sup_{i\in\mathbb{Z}_+}(S^i \wedge t - S^{i-1} \wedge t) \vee \sup_{j\in\mathbb{Z}_+}(T^j \wedge t - T^{j-1} \wedge t) = o_p(b_n^{\xi'}) \tag{2.2}$$
as $n \to \infty$ for any $t \in \mathbb{R}_+$.

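The processes $X$ and $Y$ are observed at the sampling times $(S^i)$ and $(T^j)$ with observation errors $(U^X_{S^i})_{i\in\mathbb{Z}_+}$
and $(U^Y_{T^j})_{j\in\mathbb{Z}_+}$ respectively. We assume that the observation errors have the following representations:

$$U^X_{S^i} = b_n^{-1/2}(\underline{X}_{S^i} - \underline{X}_{S^{i-1}}) + \epsilon^X_{S^i}, \qquad U^Y_{T^j} = b_n^{-1/2}(\underline{Y}_{T^j} - \underline{Y}_{T^{j-1}}) + \epsilon^Y_{T^j}.$$

Here, $\epsilon_t = (\epsilon^X_t, \epsilon^Y_t)$ for each $t$, while $\underline{X}$ and $\underline{Y}$ are two continuous semimartingales on $\mathcal{B}^{(0)}$. We can take
$\underline{X} = \phi^X X$ and $\underline{Y} = \phi^Y Y$ for some constants $\phi^X$ and $\phi^Y$, so that the observation errors can be correlated
with the returns of the latent processes $X$ and $Y$. Moreover, $\underline{X}$ and $\underline{Y}$ could also depend on the sampling
times. For these reasons we will refer to $(b_n^{-1/2}(\underline{X}_{S^i} - \underline{X}_{S^{i-1}}))_{i\in\mathbb{Z}_+}$ and $(b_n^{-1/2}(\underline{Y}_{T^j} - \underline{Y}_{T^{j-1}}))_{j\in\mathbb{Z}_+}$ as the
endogenous noise. The factor $b_n^{-1/2}$ is necessary for the endogenous noise not to degenerate asymptotically.
Such noise appears in e.g. [4] and [31]. After all, we have the observation data $\mathsf{X} = (\mathsf{X}_{S^i})_{i\in\mathbb{Z}_+}$ and
$\mathsf{Y} = (\mathsf{Y}_{T^j})_{j\in\mathbb{Z}_+}$ of the form $\mathsf{X}_{S^i} = X_{S^i} + U^X_{S^i}$ and $\mathsf{Y}_{T^j} = Y_{T^j} + U^Y_{T^j}$.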

2.2 Construction of the estimator 

In this subsection we explain the construction of the pre-averaged Hayashi-Yoshida estimator. First we
introduce a concept called pre-averaging, which was originally proposed by [43] and generalized by [27].
We choose a sequence $k_n$ of positive integers and a number $\theta \in (0, \infty)$ satisfying $k_n\sqrt{b_n} = \theta + o(b_n^{1/4})$ as
$n \to \infty$ (for example $k_n = \lceil\theta/\sqrt{b_n}\rceil$). We associate the random intervals $I^i = [S^{i-1}, S^i)$ and $J^j = [T^{j-1}, T^j)$
with the sampling schemes $(S^i)$ and $(T^j)$ and refer to $I = (I^i)_{i\in\mathbb{N}}$ and $J = (J^j)_{j\in\mathbb{N}}$ as the sampling designs
for $X$ and $Y$. For a function $\alpha$ on $\mathbb{R}_+$, we introduce the pre-averaging observation data of $\mathsf{X}$ and $\mathsf{Y}$ with the
weight function $\alpha$ and based on the sampling designs $I$ and $J$ respectively as follows:

$$\bar{\mathsf{X}}_\alpha(I)^i = \sum_{p=1}^{k_n-1}\alpha\left(\frac{p}{k_n}\right)\left(\mathsf{X}_{S^{i+p}} - \mathsf{X}_{S^{i+p-1}}\right), \quad \bar{\mathsf{Y}}_\alpha(J)^j = \sum_{q=1}^{k_n-1}\alpha\left(\frac{q}{k_n}\right)\left(\mathsf{Y}_{T^{j+q}} - \mathsf{Y}_{T^{j+q-1}}\right), \quad i,j = 0,1,\dots.$$

In the following we fix a continuous function $g : [0,1] \to \mathbb{R}$ which is piecewise $C^1$ with a piecewise Lipschitz
derivative $g'$ and satisfies $g(0) = g(1) = 0$ and $\psi_{HY} := \int_0^1 g(x)dx \neq 0$ (for example $g(x) = x \wedge (1-x)$).
The following quantity was introduced in Christensen et al. [10]:

Definition 2.1 (Pre-averaged Hayashi-Yoshida estimator). The pre-averaged Hayashi-Yoshida estimator
(PHY) of $\mathsf{X}$ and $\mathsf{Y}$ associated with sampling designs $I$ and $J$ is the process

$$PHY(\mathsf{X},\mathsf{Y}; I, J)^n_t = \frac{1}{(\psi_{HY}k_n)^2}\sum_{\substack{i,j=0 \\ S^{i+k_n}\vee T^{j+k_n}\le t}}^{\infty} \bar{\mathsf{X}}_g(I)^i\,\bar{\mathsf{Y}}_g(J)^j\,1_{\{[S^i, S^{i+k_n})\cap[T^j, T^{j+k_n})\neq\emptyset\}}, \quad t \in \mathbb{R}_+.$$
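To make Definition 2.1 concrete, here is a minimal sketch of the pre-averaging step and of the PHY sum for finitely many observations. It assumes the example weight function $g(x) = x \wedge (1-x)$, for which $\psi_{HY} = 1/4$, and it indexes the observation at time `times[i]` by `i`; the function names `pre_average` and `phy` are illustrative, and the quadratic double loop is chosen for readability rather than speed.

```python
def g(x):
    """Example weight function g(x) = min(x, 1 - x) mentioned in the text."""
    return min(x, 1.0 - x)


def pre_average(obs, k, weight=g):
    """Pre-averaged data: sum_{p=1}^{k-1} weight(p/k) * (obs[i+p] - obs[i+p-1]).

    Only indices i with i + k <= len(obs) - 1 are kept, so that the time
    with index i + k exists for every returned value.
    """
    return [
        sum(weight(p / k) * (obs[i + p] - obs[i + p - 1]) for p in range(1, k))
        for i in range(len(obs) - k)
    ]


def phy(times_x, obs_x, times_y, obs_y, k, t, psi=0.25, weight=g):
    """Pre-averaged Hayashi-Yoshida sum up to time t (illustrative implementation)."""
    xbar = pre_average(obs_x, k, weight)
    ybar = pre_average(obs_y, k, weight)
    total = 0.0
    for i, xb in enumerate(xbar):
        if times_x[i + k] > t:      # times are increasing, so later i also fail
            break
        for j, yb in enumerate(ybar):
            if times_y[j + k] > t:
                break
            # [S^i, S^{i+k}) and [T^j, T^{j+k}) intersect
            if times_x[i] < times_y[j + k] and times_y[j] < times_x[i + k]:
                total += xb * yb
    return total / (psi * k) ** 2
```

In practice `k` would be chosen of order $\theta/\sqrt{b_n}$ as described above; the sketch leaves that choice to the caller.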

For a technical reason explained in [32], we modify the above estimator as follows. The following notion
was introduced to this area in Barndorff-Nielsen et al. [4]:

Definition 2.2 (Refresh time). The first refresh time of sampling designs $I$ and $J$ is defined as $R^0 = S^0 \vee T^0$,
and then subsequent refresh times as

$$R^k := \min\{S^i \mid S^i > R^{k-1}\} \vee \min\{T^j \mid T^j > R^{k-1}\}, \quad k = 1,2,\dots.$$

We introduce new sampling schemes by a kind of next-tick interpolation to the refresh times. That
is, we define $\hat S^0 := S^0$, $\hat T^0 := T^0$, and

$$\hat S^k := \min\{S^i \mid S^i > R^{k-1}\}, \quad \hat T^k := \min\{T^j \mid T^j > R^{k-1}\}, \quad k = 1,2,\dots.$$
Note that $\hat S^k$ is an $\mathbf{F}^{(0)}$-stopping time because

Here, for a stopping time $T$ with respect to a filtration $(\mathcal{F}_t)$ and a set $A \in \mathcal{F}_T$, we define $T_A$ by $T_A(\omega) = T(\omega)$
if $\omega \in A$; $T_A(\omega) = \infty$ otherwise (see I-1.15 of [29]). Similarly $\hat T^k$ is also an $\mathbf{F}^{(0)}$-stopping time, hence so is
$R^k$.
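The refresh times of Definition 2.2 and the next-tick interpolated times can be computed in a single pass over two sorted lists of observation times. The following is a minimal sketch for finite samples; the function name `refresh_times` and the list-based interface are illustrative assumptions.

```python
def refresh_times(s_times, t_times):
    """Return (R, S_hat, T_hat) for two increasing lists of observation times.

    R[0] = max(S^0, T^0); for k >= 1, S_hat[k] and T_hat[k] are the first
    observation times strictly after R[k-1], and R[k] is their maximum.
    The construction stops when either series has no later observation.
    """
    r = [max(s_times[0], t_times[0])]
    s_hat = [s_times[0]]
    t_hat = [t_times[0]]
    i = j = 0
    while True:
        # move to the first S^i and T^j strictly after the last refresh time
        while i < len(s_times) and s_times[i] <= r[-1]:
            i += 1
        while j < len(t_times) and t_times[j] <= r[-1]:
            j += 1
        if i == len(s_times) or j == len(t_times):
            break
        s_hat.append(s_times[i])
        t_hat.append(t_times[j])
        r.append(max(s_times[i], t_times[j]))
    return r, s_hat, t_hat
```

For example, with observation times 0, 1, 2, 3, 4 for the first series and 0.5, 1.5, 3.5 for the second, the observations at times 3 and 4 of the first series are not used, since no later tick of the second series completes another refresh.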

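Then, we create new sampling designs as follows:

$$\hat I^k := [\hat S^{k-1}, \hat S^k), \quad \hat J^k := [\hat T^{k-1}, \hat T^k), \quad \hat I := (\hat I^i)_{i\in\mathbb{N}}, \quad \hat J := (\hat J^j)_{j\in\mathbb{N}}.$$

For the sampling designs $\hat I$ and $\hat J$ obtained in such a manner, we consider the pre-averaging observation data
$\bar{\mathsf{X}}_g(\hat I)^i$ and $\bar{\mathsf{Y}}_g(\hat J)^j$ of $\mathsf{X}$ and $\mathsf{Y}$ based on the sampling designs $\hat I$ and $\hat J$ respectively, i.e.,

$$\bar{\mathsf{X}}_g(\hat I)^i = \sum_{p=1}^{k_n-1} g\left(\frac{p}{k_n}\right)\left(\mathsf{X}_{\hat S^{i+p}} - \mathsf{X}_{\hat S^{i+p-1}}\right), \quad \bar{\mathsf{Y}}_g(\hat J)^j = \sum_{q=1}^{k_n-1} g\left(\frac{q}{k_n}\right)\left(\mathsf{Y}_{\hat T^{j+q}} - \mathsf{Y}_{\hat T^{j+q-1}}\right), \quad i,j = 0,1,\dots.$$

We refer to these quantities as the pre-averaging data in refresh time. Finally, our objective estimator is
given by $PHY(\mathsf{X},\mathsf{Y})^n := PHY(\mathsf{X},\mathsf{Y};\hat I,\hat J)^n$. More precisely, we have

$$PHY(\mathsf{X},\mathsf{Y})^n_t = \frac{1}{(\psi_{HY}k_n)^2}\sum_{\substack{i,j=0 \\ \hat S^{i+k_n}\vee\hat T^{j+k_n}\le t}}^{\infty} \bar{\mathsf{X}}_g(\hat I)^i\,\bar{\mathsf{Y}}_g(\hat J)^j\,1_{\{[\hat S^i,\hat S^{i+k_n})\cap[\hat T^j,\hat T^{j+k_n})\neq\emptyset\}}, \quad t \in \mathbb{R}_+.$$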
3 Main results



3.1 Conditions 

We start by introducing some notation and conditions in order to state our main result. First, for any
continuous semimartingale $Z$ on $\mathcal{B}^{(0)}$, we write its canonical decomposition as $Z = A^Z + M^Z$, where $A^Z$ is a
continuous $\mathbf{F}^{(0)}$-adapted process with a locally finite variation and $M^Z$ is a continuous $\mathbf{F}^{(0)}$-local martingale.

Next, let $N^n_t = \sum_{k=1}^{\infty} 1_{\{R^k \le t\}}$, $N^{n,1}_t = \sum_{k=1}^{\infty} 1_{\{\hat S^k \le t\}}$ and $N^{n,2}_t = \sum_{k=1}^{\infty} 1_{\{\hat T^k \le t\}}$ for each $t \in \mathbb{R}_+$ and

$$\Gamma^k = [R^{k-1}, R^k), \quad \check I^k := [\check S^k, \hat S^k), \quad \check J^k := [\check T^k, \hat T^k)$$

for each $k \in \mathbb{N}$. Here, for each $k$ we write $\check S^k = \sup_{S^i < \hat S^k} S^i$ and $\check T^k = \sup_{T^j < \hat T^k} T^j$. Note that $\check S^k$ and
$\check T^k$ may not be stopping times.

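Let $\mathbf{H}^n = (\mathcal{H}^n_t)_{t\in\mathbb{R}_+}$ be a sequence of filtrations of $\mathcal{F}^{(0)}$ to which $N^n$, $N^{n,1}$ and $N^{n,2}$ are adapted. For
each $n$, we also assume that $A^Z$ and $M^Z$ are adapted to $\mathbf{H}^n$ for every $Z \in \{X, Y, \underline{X}, \underline{Y}\}$. Then, for each $n$
and each $\rho \ge 0$ we define the processes $\chi^n$, $G(\rho)^n$, $F(\rho)^{n,1}$, $F(\rho)^{n,2}$ and $F(1)^{n,1*2}$ by

$$G(\rho)^n_s = E\left[(b_n^{-1}|\Gamma^k|)^\rho \,\middle|\, \mathcal{H}^n_{R^{k-1}}\right], \quad F(\rho)^{n,1}_s = E\left[(b_n^{-1}|\check I^k|)^\rho \,\middle|\, \mathcal{H}^n_{R^{k-1}}\right], \quad F(\rho)^{n,2}_s = E\left[(b_n^{-1}|\check J^k|)^\rho \,\middle|\, \mathcal{H}^n_{R^{k-1}}\right],$$
$$\chi^n_s = P\left(\hat S^k = \hat T^k \,\middle|\, \mathcal{H}^n_{R^{k-1}}\right), \quad F(1)^{n,1*2}_s = b_n^{-1}E\left[|\check I^k * \check J^k| \,\middle|\, \mathcal{H}^n_{R^{k-1}}\right]$$

when $s \in \Gamma^k$. Here, $\check I^k * \check J^k = (\check I^k \cap \check J^k) \cup (\check I^{k+1} \cap \check J^k) \cup (\check I^k \cap \check J^{k+1})$.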

The following condition is necessary to compute the asymptotic variance of the estimation error of our 
estimator explicitly. 

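[H1] (i) For each $n$, we have a càdlàg $\mathbf{F}^{(0)}$-adapted process $G^n$ and a random subset $\mathcal{N}^0_n$ of $\mathbb{N}$ such that
$(\#\mathcal{N}^0_n)_{n\in\mathbb{N}}$ is tight, $G(1)^n_{R^{k-1}} = G^n_{R^{k-1}}$ for any $k \in \mathbb{N} - \mathcal{N}^0_n$, and there exist a càdlàg $\mathbf{F}^{(0)}$-adapted
process $G$ and a constant $\delta > 1 - \xi'$ satisfying that $G$ and $G_-$ do not vanish and that $b_n^{-\delta}(G^n - G) \xrightarrow{ucp} 0$
as $n \to \infty$.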

(ii) There exists a constant /? > such that (sup 0<s<t G(p)™) ngN is tight for all t > 0. 

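(iii) For each $n$, we have a càdlàg $\mathbf{F}^{(0)}$-adapted process $\chi'^n$ and a random subset $\mathcal{N}'_n$ of $\mathbb{N}$ such that
$(\#\mathcal{N}'_n)_{n\in\mathbb{N}}$ is tight, $\chi^n_{R^{k-1}} = \chi'^n_{R^{k-1}}$ for any $k \in \mathbb{N} - \mathcal{N}'_n$, and there exist a càdlàg $\mathbf{F}^{(0)}$-adapted process
$\chi$ and a constant $\delta' > 1 - \xi'$ satisfying $b_n^{-\delta'}(\chi'^n - \chi) \xrightarrow{ucp} 0$ as $n \to \infty$.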

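(iv) For each $n$ and $l = 1, 2, 1*2$, we have a càdlàg $\mathbf{F}^{(0)}$-adapted process $F^{n,l}$ and a random subset
$\mathcal{N}^l_n$ of $\mathbb{N}$ such that $(\#\mathcal{N}^l_n)_{n\in\mathbb{N}}$ is tight, $F(1)^{n,l}_{R^{k-1}} = F^{n,l}_{R^{k-1}}$ for any $k \in \mathbb{N} - \mathcal{N}^l_n$, and there exist a càdlàg
$\mathbf{F}^{(0)}$-adapted process $F^l$ and a constant $\delta^l > 1 - \xi'$ satisfying $b_n^{-\delta^l}(F^{n,l} - F^l) \xrightarrow{ucp} 0$ as $n \to \infty$.

(v) There exists a constant $\rho' > 1/\xi'$ such that $(\sup_{0\le s\le t} F(\rho')^{n,l}_s)_{n\in\mathbb{N}}$ is tight for all $t > 0$ and $l = 1, 2$.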

The following condition is a sufficient one for the condition [HI]: 

[H1'] (i) There exists a number $\bar\rho > 1/\xi'$ such that for every $\rho \in [0, \bar\rho]$ we have a càdlàg $\mathbf{F}^{(0)}$-adapted process
$G(\rho)$ such that $G(\rho)^n \xrightarrow{ucp} G(\rho)$ as $n \to \infty$. Furthermore, $G$ and $G_-$ do not vanish and there exists a
constant $\delta > 1 - \xi'$ satisfying $b_n^{-\delta}(G(1)^n - G) \xrightarrow{ucp} 0$ as $n \to \infty$ with $G = G(1)$.

(ii) There exist a càdlàg $\mathbf{F}^{(0)}$-adapted process $\chi$ and a constant $\delta > 1 - \xi'$ such that $b_n^{-\delta}(\chi^n - \chi) \xrightarrow{ucp} 0$
as $n \to \infty$.

(iii) There exists a number $\bar\rho' > 1/\xi'$ such that for every $l = 1, 2$ and every $\rho' \in [0, \bar\rho']$ we have a
càdlàg $\mathbf{F}^{(0)}$-adapted process $F(\rho')^l$ such that $F(\rho')^{n,l} \xrightarrow{ucp} F(\rho')^l$ as $n \to \infty$. Furthermore, there exists
a constant $\delta > 1 - \xi'$ satisfying $b_n^{-\delta}(F(1)^{n,l} - F^l) \xrightarrow{ucp} 0$ as $n \to \infty$ with $F^l = F(1)^l$ for each $l = 1, 2$.

(iv) There exist a càdlàg $\mathbf{F}^{(0)}$-adapted process $F^{1*2}$ and a constant $\delta > 1 - \xi'$ such that $b_n^{-\delta}(F(1)^{n,1*2} - F^{1*2}) \xrightarrow{ucp} 0$
as $n \to \infty$.

Remark 3.1. An [H1']-type condition appears in Hayashi et al. [23] (see assumptions E(g) and E'(g) of
[23]). The reason why we introduce the exceptional sets $\mathcal{N}^l_n$ ($l = 0, 1, 2, 1*2, '$) is that the condition [H1]
without them is too local. To explain this, we focus on the univariate case. Note that in this case we have
$R^k = S^k$ ($k = 0, 1, 2, \dots$). Let $\tau$ be a positive number and suppose that $(S^i)$ is a sequence of Poisson arrival
times whose intensity is $\underline\lambda$ before the time $\tau$ and $\overline\lambda$ after $\tau$. Then the structure of the process $G(1)^n$ becomes
very complex around the time $\tau$ (provided, of course, that $\underline\lambda \neq \overline\lambda$), so that it will be difficult to verify the convergence
$G(1)^n \xrightarrow{ucp} G$ because it requires a kind of uniformity. See also Example 4.4.
Next we introduce a kind of continuity of a stochastic process which we mentioned in the introduction.
Definition 3.1. Let $\lambda \in [0, 1]$.

(i) A process $V$ is of class $(A_\lambda)$ if there is a positive constant $C$ satisfying

$$E\left[|V_{\tau_1} - V_{\tau_2}|^2 \,\middle|\, \mathcal{F}_{\tau_1\wedge\tau_2}\right] \le C\,E\left[|\tau_1 - \tau_2|^{1-\lambda} \,\middle|\, \mathcal{F}_{\tau_1\wedge\tau_2}\right]$$

for any bounded $\mathbf{F}^{(0)}$-stopping times $\tau_1$ and $\tau_2$.

(ii) A process $V$ is of class $(AL_\lambda)$ if there is a sequence $(\sigma_k)$ of $\mathbf{F}^{(0)}$-stopping times such that $\sigma_k \uparrow \infty$ as
$k \to \infty$ and the stopped process $V^{\sigma_k}$ is of class $(A_\lambda)$ for every $k$.

If both processes $V$ and $W$ are of class $(AL_\lambda)$ for some $\lambda \in [0, 1]$, then the process $V + W$ is obviously
of class $(AL_\lambda)$. Moreover, the class $(AL_\lambda)$ is non-decreasing in $\lambda$. That is, if $0 \le \lambda_1 < \lambda_2 \le 1$ and a process
$V$ is of class $(AL_{\lambda_1})$, then $V$ is also of class $(AL_{\lambda_2})$. In fact, if $\tau$ is an $\mathbf{F}^{(0)}$-stopping time such that $V^\tau$ is of
class $(A_{\lambda_1})$, then $V^{\tau\wedge K}$ is of class $(A_{\lambda_2})$ for any $K > 0$. This implies $V$ is of class $(AL_{\lambda_2})$.

In the following, processes of class $(AL_\lambda)$ for any $\lambda \in (0, 1]$ play an important role. Here we give some
examples of such processes.

Example 3.1. If $B$ is an $\mathbf{F}^{(0)}$-adapted process with a locally integrable variation and the predictable
compensator of the variation process of $B$ is absolutely continuous, then $B$ is of class $(AL_0)$.

Example 3.2. If $L$ is a locally square-integrable martingale on $\mathcal{B}^{(0)}$ and its predictable quadratic variation
process is absolutely continuous, then $L$ is of class $(AL_0)$.

Example 3.3. For a real-valued function $x$ on $\mathbb{R}_+$, the modulus of continuity on $[0,T]$ is denoted by
$w(x; \delta, T) = \sup\{|x(t) - x(s)|; s,t \in [0,T], |s - t| < \delta\}$ for $T, \delta > 0$. Then, if an $\mathbf{F}^{(0)}$-adapted process $V$
satisfies $w(V; h, t) = O_p(h^{1/2-\lambda})$ as $h \to 0$ for every $t, \lambda \in (0, \infty)$, then $V$ is of class $(AL_\lambda)$ for any $\lambda \in (0, 1]$.
An interesting example of such processes which is not covered by the above examples is the class of fractional
Brownian motions with Hurst indices greater than 1/2.

Instead of a kind of strong predictability, we impose the following condition on the sampling times:

[H2] (i) $S^i$ and $T^i$ are $\mathbf{F}^{(0)}$-predictable times for every $i$.

(ii) The process $G$ in the condition [H1] is of the form $G_t = V^G_t + \sum_{k=1}^{N^G_t}\gamma^G_k$, where $V^G$ is of class $(AL_\lambda)$
for any $\lambda \in (0, 1]$, $N^G$ is an adapted point process and $(\gamma^G_k)$ is a sequence of random variables.

(iii) The process $\chi$ in the condition [H1] is of the form $\chi_t = V^\chi_t + \sum_{k=1}^{N^\chi_t}\gamma^\chi_k$, where $V^\chi$ is of class $(AL_\lambda)$
for any $\lambda \in (0, 1]$, $N^\chi$ is an adapted point process and $(\gamma^\chi_k)$ is a sequence of random variables.

(iv) For each $l = 1, 2, 1*2$, the process $F^l$ in the condition [H1] is of the form $F^l_t = V^{F^l}_t + \sum_{k=1}^{N^{F^l}_t}\gamma^{F^l}_k$,
where $V^{F^l}$ is of class $(AL_\lambda)$ for any $\lambda \in (0, 1]$, $N^{F^l}$ is an adapted point process and $(\gamma^{F^l}_k)$ is a sequence
of random variables.

Remark 3.2. (i) We will explain why we need the condition [H2](i) in Remark 3.3. This condition is not
restrictive in the framework of continuous processes because hitting times of continuous adapted processes
are predictable. Note that $\hat S^k$, $\hat T^k$ and $R^k$ are also $\mathbf{F}^{(0)}$-predictable times under [H2](i) by Eq. (2.3).

(ii) The conditions [H2](ii)-(iv) are also not restrictive, at least in the univariate case (in the univariate case
we have $G = F^1 = F^2 = F^{1*2}$ and $\chi = 1$, so that it is sufficient that [H2](ii) holds). For example, renewal
sampling schemes satisfy these conditions because the conditionally expected durations of such schemes are
constant. Other examples satisfying [H2] are given in Section 4. In particular, sampling times generated by
hitting barriers satisfy [H2] (see Example 4.1), and in this case the asymptotic skewness of returns does not
vanish. We include terms with finite activity jumps such as $\sum_{k=1}^{N^G_t}\gamma^G_k$ in [H2] to treat sampling schemes such as
the one stated in Remark 3.1 (see also Example 4.4).

(iii) We also remark that in the econometric literature conditionally expected durations are often modeled
by GARCH-type models (such as the ACD model of [13]) or SV-type models (such as the SCD model of [7]).
Since such models can be approximated by Itô semimartingales (see [36] and references therein), [H2] is also
not restrictive from the econometric point of view in the light of Examples 3.1-3.2.

The volatility processes should also have a kind of continuity: 

[H3] For each $V, W = X, Y, \underline{X}, \underline{Y}$, $[V, W]$ is absolutely continuous with a càdlàg derivative, and the density
process $[V, W]'$ is of class $(AL_\lambda)$ for any $\lambda \in (0, 1]$.
In consideration of Examples 3.1-3.3, [H3] is standard in the literature; see e.g. [23] and [25].
The maximum of the durations needs to converge fairly fast.

[H4] $\frac{5}{6} < \xi' < 1$ and (2.2) holds for every $t \in \mathbb{R}_+$.

An [H4]-type condition often appears in the literature (e.g., [8, 25, 34]). As naturally expected, this condition
has a connection with the condition [H1]. To explain this, we introduce an auxiliary condition. Let $\rho$ be a
positive number.

[K$_\rho$] The sequence of the processes $(\sup_{0\le s\le t} G(\rho)^n_s)_{n\in\mathbb{N}}$ is tight as $n \to \infty$ for all $t > 0$.

Lemma 3.1. Suppose that [H1](i) and [K$_\rho$] hold for some $\rho > 1$. Then $\sup_{0\le t\le T}|\Gamma^{N^n_t+1}| = O_p(b_n^{1-1/\rho})$ as
$n \to \infty$ for any $T > 0$.

Proof. By an argument similar to the proof of Lemma 10.4 of [32], we can show that $N^n_T = O_p(b_n^{-1})$.
Therefore, the Lenglart inequality and [K$_\rho$] yield $\sum_{k=1}^{N^n_T+1}|\Gamma^k|^\rho = O_p(b_n^{\rho-1})$. Since $\{\sup_{0\le t\le T}|\Gamma^{N^n_t+1}|\}^\rho \le \sum_{k=1}^{N^n_T+1}|\Gamma^k|^\rho$, we complete the proof of the lemma. □

Since $r_n(t) \le 2\sup_k|\Gamma^k(t)| \le 2\sup_{0\le s\le t}|\Gamma^{N^n_s+1}|$, we obtain the following result:
Corollary 3.1. [H4] holds true if [K$_\rho$] holds for some $\rho > 6$.

We impose the following regularity conditions on the drift processes and the noise process: 

[H5] For each V = A x , A Y , A—, A—, V is absolutely continuous with a cadlag derivative, and the density 

process V is of class (ALa) for some A G (0, |). 
[H6] (i) $(\int |z|^8 Q_t(dz))_{t\in\mathbb{R}_+}$ is a locally bounded process.

(ii) The covariance matrix process $\Psi_t(\omega^{(0)}) = \int zz^* Q_t(\omega^{(0)}, dz)$ is càdlàg and quasi-left continuous.

(iii) For every $i, j = 1, 2$ the process $\Psi^{ij}$ is of class $(AL_\lambda)$ for any $\lambda \in (0, 1]$.



Remark 3.3. The conditions [H2](i) and [H6](ii) are necessary for the following technical reason. In the
proof we will regard the noise process $(\epsilon^X_{\hat S^p})$ as the (martingale) differences of the purely discontinuous locally
square-integrable martingale $\sum_{p=1}^{\infty}\epsilon^X_{\hat S^p}1_{\{\hat S^p\le t\}}$ on $\mathcal{B}$. Then we need to consider the predictable quadratic
variation process (with respect to the filtration $\mathbf{F}$) of this process. [H2](i) and [H6](ii) ensure it is given by
$\sum_{p=1}^{\infty}\Psi^{11}_{\hat S^p}1_{\{\hat S^p\le t\}}$. Also, [H6](ii) is satisfied by many processes. For example, all of the processes in Examples
3.1-3.3 satisfy it. We refer to [29] for more details on the concepts appearing here.

Finally, we introduce constants appearing in the representation of the asymptotic variance of our estimator.
For any real-valued bounded measurable functions $\alpha, \beta$ on $\mathbb{R}$, we define the function $\psi_{\alpha,\beta}$ on $\mathbb{R}$ by
$\psi_{\alpha,\beta}(x) = \int_0^1\int_{x+u-1}^{x+u}\alpha(u)\beta(v)\,dv\,du$ for every $x \in \mathbb{R}$. Then, we extend the functions $g$ and $g'$ to the whole
real line by setting $g(x) = g'(x) = 0$ for $x \notin [0, 1]$ and put

$$\kappa := \int_{-2}^{2}\psi_{g,g}(x)^2dx, \quad \tilde\kappa := \int_{-2}^{2}\psi_{g',g'}(x)^2dx, \quad \bar\kappa := \int_{-2}^{2}\psi_{g,g'}(x)^2dx.$$

3.2 Results 

Now we are ready to state the main theorem of this article. 

Theorem 3.1. (a) Suppose [H1](i)-(iii), [H2](i)-(iii) and [H3]-[H6] are satisfied. Suppose also that $\underline{X} = \underline{Y} = 0$. Then

$$b_n^{-1/4}\{PHY(\mathsf{X},\mathsf{Y})^n - [X,Y]\} \to^{d_s} \int_0^\cdot w_s\,dW_s \tag{3.1}$$

as $n \to \infty$, where $W$ is a one-dimensional standard Wiener process (defined on an extension of $\mathcal{B}$) independent
of $\mathcal{F}$ and $w$ is given by

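$$w_s^2 = \psi_{HY}^{-4}\Big[\theta\kappa\{[X]'_s[Y]'_s + ([X,Y]'_s)^2\}G_s + \theta^{-3}\tilde\kappa\{\Psi^{11}_s\Psi^{22}_s + (\Psi^{12}_s\chi_s)^2\}G_s^{-1} + \theta^{-1}\bar\kappa\{[X]'_s\Psi^{22}_s + [Y]'_s\Psi^{11}_s + 2[X,Y]'_s\Psi^{12}_s\chi_s\}\Big]. \tag{3.2}$$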
(b) Suppose [H1]-[H6] are satisfied. Then (3.1) holds as $n \to \infty$, where $W$ is as above and $w$ is given
by



6k{[X]' s [Y]' s + ({X,Y}> s f} G s + 9- 3 n {*>f + ) 2 | G- 1 
+ 6- 1 K{[X}' s vf + [YY^l 1 + 2[X,Y]' S ^ ({X,Y}' S F} - [Y, Y]' S F S 2 ) 2 G; 1 } 
with 9? = + [X]' s Fj, *f = tff + m > sF 2 and ^ = ^,u Xs + [x,Yy s F}* 2 . 



(3.3) 



The proof of this theorem is given in Appendix A. As was announced in the introduction, the time endogeneity
has no impact on the asymptotic distribution of the pre-averaged Hayashi-Yoshida estimator, compared with
Theorem 3.1 of [32]. It is also worth noting that the endogeneity of the noise pushes down the asymptotic
variance.

In the univariate case, we have $\hat S^k = \hat T^k = R^k$ for all $k$, so that

^^ = j^ E WW 

for each $t \in \mathbb{R}_+$. Moreover, [H1](i)-(ii) and [H2](ii) imply [H1](iv)-(v) and [H2](iv) respectively, and
[H1](iii) and [H2](iii) are automatically satisfied because $\chi^n \equiv 1$. Consequently, we obtain the following
result:

Corollary 3.2. Suppose [H1](i)-(ii), [H2](i)-(ii) and [H3]-[H6] are satisfied with taking $S^k = R^k$ for every
$k$. Then

$$b_n^{-1/4}\{PHY(\mathsf{X},\mathsf{X})^n - [X]\} \to^{d_s} \int_0^\cdot w_s\,dW_s \quad \text{in } D(\mathbb{R}_+)$$
as $n \to \infty$, where $W$ is as above and $w$ is given by



i/' 4 



HY 



*") 2 ± + 26- 1 K[xy s * 11 



Interestingly, neither the endogeneity of the sampling times nor that of the noise has any impact on the
asymptotic distribution.

Remark 3.4. (i) A brief explanation of the reason why the asymptotic skewness of returns has no impact on
the asymptotic variance of the PHY can be given in the following way. For simplicity we focus on the univariate
case without the noise and drift. Then, the predictable quadratic covariation of the estimation error of
the PHY and the martingale $X$ is given by the sum of terms like $\sum_{p=0}^{k_n-1}g\left(\frac{p}{k_n}\right)[X](\hat I^{i+p})\sum_{q=0}^{k_n-1}g\left(\frac{q}{k_n}\right)X(\hat I^{j+q})$
with $|i - j| < k_n$. In such a term, variables corresponding to the third power of returns (i.e., terms involving
variables like $[X](\hat I^k)X(\hat I^k)$) have no impact in the first order. For a similar reason the asymptotic kurtosis
of returns also has no impact on the asymptotic variance of the PHY.

(ii) Due to Lemma 3.1 of [32], in the estimation error of the PHY we can replace the (pre-averaging version of the)
Hayashi-Yoshida type sampling design kernel $1_{\{[\hat S^i,\hat S^{i+k_n})\cap[\hat T^j,\hat T^{j+k_n})\neq\emptyset\}}$ by a certain deterministic function.
This enables us to handle the nonsynchronous case with no difficulty. This is quite different from the case
of the Hayashi-Yoshida estimator in a pure diffusion setting, in which the Hayashi-Yoshida sampling design
kernel plays a central role in the first order calculus.



4 Examples 

4.1 Univariate case

Example 4.1 (Times generated by hitting barriers). This example was treated in Section 4.4 of [17] and
Example 4 of [34].

Suppose that [H3] is satisfied and both $[X]'$ and $[X]'_-$ do not vanish. Define

$$S^0 = 0, \quad S^{i+1} = \inf\left\{t > S^i \,\middle|\, M^X_t - M^X_{S^i} = -u\sqrt{b_n} \text{ or } M^X_t - M^X_{S^i} = v\sqrt{b_n}\right\} \tag{4.1}$$
for positive constants $u, v$. Then, using a representation of a continuous local martingale with Brownian
motion, we have

$$P\left(M^X_{S^{i+1}} - M^X_{S^i} = -u\sqrt{b_n}\right) = v/(u+v), \quad P\left(M^X_{S^{i+1}} - M^X_{S^i} = v\sqrt{b_n}\right) = u/(u+v).$$

Combining the above formula with Proposition 2.1 of [39] (again using a representation of a continuous local
martingale with Brownian motion), we obtain the following result: for each $r \ge 1$ there exists a positive
constant $C_r$ such that $E\left[|[X]_{S^{i+1}} - [X]_{S^i}|^r\right] \le C_r b_n^r$ for every $n, i$. In particular, this inequality yields that [K$_\rho$]
holds for any $\rho > 1$ because $\inf_{0\le s\le t}[X]'_s > 0$ for any $t > 0$. Therefore, (2.2) holds for any $\xi' \in (0,1)$
by Lemma 3.1. Noting these results and the condition [H3], we can also show that [H1] holds with
$\mathbf{H}^n = \mathbf{F}^{(0)}$ and $G_s = uv/[X]'_s$. This result also implies that [H2](ii) holds true. Finally, [H2](i) is also
satisfied because $M^X$ is continuous.

Note that in this example the asymptotic skewness of the returns does not vanish if $u \neq v$.
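The barrier-hitting scheme of Example 4.1 can be mimicked on a fine simulation grid. The sketch below is only a discrete approximation under stated assumptions: the continuous martingale is replaced by a path stored as a list of values on a grid, so a barrier is detected at the first grid point at or beyond it and may be slightly overshot. The function name `hitting_time_sampling` and its interface are illustrative.

```python
import math
import random


def hitting_time_sampling(path, u, v, b):
    """Grid indices at which the path has moved by <= -u*sqrt(b) or >= v*sqrt(b)
    since the previous sampling index (index 0 is always sampled)."""
    lower, upper = -u * math.sqrt(b), v * math.sqrt(b)
    sampled = [0]
    for idx in range(1, len(path)):
        move = path[idx] - path[sampled[-1]]
        if move <= lower or move >= upper:
            sampled.append(idx)
    return sampled


if __name__ == "__main__":
    # Illustrative check on a simulated Brownian path: with u != v the sampled
    # returns are asymmetric, and the share of downward moves should be
    # roughly v / (u + v), up to discretization and sampling error.
    rng = random.Random(1)
    n_grid, u, v, b = 200_000, 2.0, 1.0, 1e-3
    path = [0.0]
    for _ in range(n_grid):
        path.append(path[-1] + math.sqrt(1.0 / n_grid) * rng.gauss(0.0, 1.0))
    idx = hitting_time_sampling(path, u, v, b)
    moves = [path[j] - path[i] for i, j in zip(idx, idx[1:])]
    print(len(moves), sum(m < 0 for m in moves) / max(len(moves), 1))
```

Feeding the sampled indices into a realized volatility or pre-averaging routine gives a simple way to experiment with endogenous sampling times of this type.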

Remark 4.1. In Example 4.1, the stable convergence results of Theorem 3.1 still hold when we replace $M^X$
in (4.1) by $X$. This can be shown in the following way: first, by a localization argument it is sufficient to

consider processes stopped at some positive number T. Let Z t = cxp ( J* (A x )' s /[X] s dMf — ^A x ^j. As is 
well known, $Z_t$ is a positive continuous local martingale. Therefore, again by a localization argument we
may assume that both $Z$ and $1/Z$ are bounded. In particular, $Z$ is a martingale, so that we can define a
probability measure $P_T$ on $(\Omega, \mathcal{F}_T)$ by $P_T(E) = P(1_E Z_T)$. $P_T$ is obviously equivalent to the probability
measure $P$ restricted to $(\Omega, \mathcal{F}_T)$. Then, by Girsanov's theorem $X$ is a continuous $\mathbf{F}^{(0)}$-local martingale under
$P_T$, hence [H1], [H2] and [H4] hold true under $P_T$. Moreover, [H3] and [H5]-[H6] are also satisfied under
$P_T$ due to the Bayes rule. Therefore, (3.1) holds true under $P_T$. Since stable convergence is preserved under
equivalent changes of probability measures, (3.1) also holds true under the original probability measure $P$.
Further, in this case we do not need (A x )' is of class (ALa) for some A € (0, j). 

Example 4.2 (General return distribution). This example was considered in Section 4.3 of [15] and Example 
5 of [34], and can be regarded as a generalization of Example 4.1. 

Let $W$ be a one-dimensional standard Wiener process on a stochastic basis $(\Omega', \mathcal{F}', (\mathcal{F}'_t), P')$. Suppose that $\Psi$ is adapted to the filtration $(\mathcal{F}'_t)$. Let $\mu$ be a probability measure on $\mathbb{R}$ with mean 0, and suppose that $\mu$ is not a Dirac measure, i.e., $\mu(\{0\}) < 1$. Then, by Lemma 108 in Chapter 1 of [14] we can construct a sequence of i.i.d. random vectors $(U_0, V_0), (U_1, V_1), \dots$ on a probability space $(\Omega'', \mathcal{F}'', P'')$ satisfying the following conditions for every $i$:

(i) $U_i, V_i > 0$ a.s.,

(ii) For any $x \in \mathbb{R}$, $\mu((-\infty, x]) = \int_{\Omega''} G_{U_i(\omega''), V_i(\omega'')}(x)\,P''(d\omega'')$, where for $u, v > 0$, $G_{u,v}$ is the distribution function of the random variable $\zeta$ such that $P(\zeta = -u) = 1 - P(\zeta = v) = v/(u+v)$.

Now construct the stochastic basis $\mathcal{B}^{(0)}$ by

$\Omega^{(0)} = \Omega' \times \Omega''$, $\mathcal{F}^{(0)} = \mathcal{F}' \otimes \mathcal{F}''$, $\mathcal{F}^{(0)}_t = \mathcal{F}'_t \otimes \mathcal{F}''$, $P^{(0)} = P' \times P''$. (4.2)

Then, we define $(S^i)$ sequentially by $S^0 = 0$ and

$S^{i+1} = \inf\{t > S^i \mid W_t - W_{S^i} = -U_i\sqrt{b_n} \text{ or } W_t - W_{S^i} = V_i\sqrt{b_n}\}$, $i = 0, 1, \dots$.

By construction $S^i$ is an $\mathbf{F}^{(0)}$-predictable time for every $i$ and $(W_{S^{i+1}} - W_{S^i})_{i \in \mathbb{Z}_+}$ is a sequence of independent random variables. Furthermore, Lemma 115 in Chapter 1 of [14] implies (2.1), $b_n^{-1/2}(W_{S^{i+1}} - W_{S^i}) \sim \mu$ and $E[S^{i+1} - S^i] = b_n\int_{\mathbb{R}} x^2\mu(dx)$. This is known as the Skorohod representation, which is closely related to the so-called Skorohod stopping problem (see [39] for details). In the present situation we need the predictability of $S^i$, which is why we give the precise construction of $S^i$.

Now we verify the conditions [H1]-[H2] and [H4]. For this purpose we need to assume that $\int_{\mathbb{R}} |x|^k\mu(dx) < \infty$ for some $k > 12$. Then, Proposition 2.1 of [39] yields $[K_{k/2}]$, so that [H1](ii) holds. Moreover, [H4] is also satisfied by Lemma 3.1. On the other hand, letting $\mathbf{H}^n$ be the filtration generated by the processes $W_t$, $\Psi_t$ and the process $\sum_i 1_{\{S^i \le t\}}$, [H1] holds true with $G_s = \int_{\mathbb{R}} x^2\mu(dx)$. Thus [H2] is also satisfied.

Example 4.3 (Dynamic Mixed Hitting-Time Model). This model was introduced in Renault et al. [44] and 
also discussed in Example 6 of [34].

First we construct the stochastic basis $\mathcal{B}^{(0)}$ which is appropriate for the present situation. Let $(\Omega', \mathcal{F}', (\mathcal{F}'_t), P')$ be a stochastic basis, and suppose that the semimartingales $X$ and $\underline{X}$ are defined on this basis. Suppose also that $\Psi$ is $(\mathcal{F}'_t)$-adapted. Moreover, suppose that there exist a one-dimensional standard Wiener process $W$ and two positive càdlàg adapted processes $\mu$ and $c$ on $(\Omega', \mathcal{F}', (\mathcal{F}'_t), P')$. On the other hand, let $(\zeta_i)_{i \in \mathbb{Z}_+}$ be positive i.i.d. random variables with mean 1 on an auxiliary probability space $(\Omega'', \mathcal{F}'', P'')$. Then we define the stochastic basis $\mathcal{B}^{(0)}$ by (4.2).

We define the sampling scheme $(S^i)$ sequentially by $S^0 = 0$ and

$S^{i+1} = \inf\{t > S^i \mid W_t - W_{S^i} + b_n^{-1/2}\mu_{S^i}(t - S^i) = b_n^{1/2}c_{S^i}\zeta_i\}$.



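By construction $S^i$ is an $\mathbf{F}^{(0)}$-predictable time for every $i$. Moreover, the conditional distribution of $S^{i+1} - S^i$ given $\mathcal{F}^{(0)}_{S^i}$ is the inverse Gaussian distribution $IG(b_n^{1/2}c_{S^i}\zeta_i, b_n^{-1/2}\mu_{S^i})$, where the probability density function of the inverse Gaussian distribution $IG(\delta, \gamma)$ is given by

$p(z; \delta, \gamma) = \frac{\delta e^{\delta\gamma}}{\sqrt{2\pi}}\,z^{-3/2}\exp\left(-\frac{1}{2}\left(\frac{\delta^2}{z} + \gamma^2 z\right)\right)$, $z > 0$.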
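As a sanity check on the $(\delta, \gamma)$ parameterization of the inverse Gaussian law, the sketch below evaluates the density $\frac{\delta e^{\delta\gamma}}{\sqrt{2\pi}} z^{-3/2}\exp(-\frac{1}{2}(\delta^2/z + \gamma^2 z))$ and integrates it numerically. This is an illustration under the assumption that $IG(\delta, \gamma)$ is the first-passage law of a Brownian motion with drift $\gamma$ to the level $\delta$, whose mean is $\delta/\gamma$. The helper names and the chosen parameter values are hypothetical.

```python
import math


def ig_pdf(z, delta, gamma):
    """Density of IG(delta, gamma) at z, written with a single exponent.

    delta*gamma - (delta^2/z + gamma^2 z)/2 = -(delta - gamma z)^2 / (2 z),
    so the exponent is never positive and cannot overflow.
    """
    if z <= 0.0:
        return 0.0
    exponent = -((delta - gamma * z) ** 2) / (2.0 * z)
    return delta / math.sqrt(2.0 * math.pi) * z ** -1.5 * math.exp(exponent)


def midpoint_integral(f, a, b, steps):
    """Plain midpoint rule on [a, b]; avoids evaluating f at the endpoint 0."""
    h = (b - a) / steps
    return h * sum(f(a + (k + 0.5) * h) for k in range(steps))


delta, gamma = 1.0, 2.0  # illustrative values only
total_mass = midpoint_integral(lambda z: ig_pdf(z, delta, gamma), 0.0, 40.0, 40000)
mean = midpoint_integral(lambda z: z * ig_pdf(z, delta, gamma), 0.0, 40.0, 40000)
print(total_mass, mean)
```

With $\delta = b_n^{1/2}c_{S^i}\zeta_i$ and $\gamma = b_n^{-1/2}\mu_{S^i}$ the mean $\delta/\gamma$ becomes $b_n\zeta_i c_{S^i}/\mu_{S^i}$, so durations are of order $b_n$.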

In order to verify the conditions [H1]-[H2] and [H4], we additionally make the following assumptions: both $c_-$ and $\mu_-$ do not vanish, $\psi := c/\mu$ is of class $(AL_\lambda)$ for any $\lambda > 0$ and $E[|\zeta_i|^p] < \infty$ for some $p > 6$. Then, letting $\mathcal{H}^n_t$ be the $\sigma$-field generated by $\mathcal{F}'_t$ and the random variable $\sum_i 1_{\{S^i \le t\}}$ for each $t \in \mathbb{R}_+$, we have



G{p) n s >=E 



0,1,. 



by Eq. (2.16) of [30], where $K_\lambda$ is the modified Bessel function of the third kind with index $\lambda$. Now we note the following properties of the function $K_\lambda$. First, Theorem 1.2 of [33] implies that for any $\lambda > 0$ there exists a positive constant $C_\lambda$ such that $K_\lambda(x)/K_{\lambda-1}(x) \le C_\lambda(1 + x^{-1})$ for any $x > 0$. Second, for any $x > 0$, $K_\lambda(x)$ is strictly increasing in $\lambda$ for $\lambda > 0$. This follows from Eq. (2.12) of [33]. These facts yield the condition $[K_p]$, hence Lemma 3.1 implies that [H4] holds true. Moreover, by construction $(\zeta_i)$ is independent of $\mathcal{F}'$, hence we have [H1] with $G = \psi$. From this we also obtain [H2]. In this model we can obtain endogenous sampling times by introducing a correlation between $X$ and $W$.



4.2 Nonsynchronous case

Example 4.4 (Poisson sampling with a random change point). This example is a version of the model 
discussed in Section 8.3 of Hayashi and Yoshida [25]. 

As in the preceding example, we first construct an appropriate stochastic basis $\mathcal{B}^{(0)}$. Let $(\Omega', \mathcal{F}', (\mathcal{F}'_t), P')$ be a stochastic basis, and suppose that the semimartingales $X$, $Y$, $\underline{X}$ and $\underline{Y}$ are defined on this basis. Suppose also that $\Psi$ is $(\mathcal{F}'_t)$-adapted. Furthermore, on an auxiliary probability space $(\Omega'', \mathcal{F}'', P'')$, there are mutually independent standard Poisson processes $(\underline{N}^k_t)$, $(\overline{N}^k_t)$ ($k = 1, 2$). Then we construct $\mathcal{B}^{(0)}$ by (4.2).

Next we construct our sampling schemes. For each $k = 1, 2$, let $\underline{p}^k, \overline{p}^k \in (0, \infty)$ and let $\tau^k$ be an $(\mathcal{F}'_t)$-stopping time. Define $(\underline{S}^l)$ and $(\overline{S}^l)$ each as the arrival times of the point processes $\underline{N}^{n,1} = (\underline{N}^1_{n\underline{p}^1 t})$ and $\overline{N}^{n,1} = (\overline{N}^1_{n\overline{p}^1 t})$ respectively. Then, we define $(S^i)$ sequentially by $S^0 = 0$ and

S ' = ii{-( S '" 1<: ^ <Tll '' Tl+S ){S'- 1 <ri + S m }} ' 1 = 1,2,.... 

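$(T^j)$ is defined in the same way using $\underline{N}^{n,2} = (\underline{N}^2_{n\underline{p}^2 t})$, $\overline{N}^{n,2} = (\overline{N}^2_{n\overline{p}^2 t})$ and $\tau^2$ instead of $\underline{N}^{n,1}$, $\overline{N}^{n,1}$ and $\tau^1$ respectively.

Let $\mathbf{H}^n$ be the filtration generated by the $\sigma$-field $\mathcal{F}'$ and the processes $N^{n,1}$, $N^{n,2}$. Then, in a similar manner to Section 5.2 of [32] we can show that [H1] and [H4] are satisfied with $b_n = n^{-1}$, $\chi = 0$ and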

/ 1 1 1 \ / 1 1 

\p L p z p L + p Z J \p p Z 

\ / 1 1 

l{T 2 <s<r 1 } + =1 + Z2 




p 1 +p 2 
and



1*2 ~ 2 2 2 2 

F s " pl+^M^^Ar 2 } + _l +p2 l{Tl<»<T»} + pl + -2 1 {r 2 < S <ri} + =1 + _ 2 l{ri VrK S } ■ 

From these formulae [H2](ii)-(iv) also hold true. Finally, [H2](i) is also satisfied because $\underline{S}^l$, $\overline{S}^l$, $\underline{T}^l$ and $\overline{T}^l$ are $\mathbf{F}^{(0)}$-predictable times for all $l$.

Example 4.5 (Continuous-time analog of the Lo-MacKinlay model). Lo and MacKinlay [37] regarded the nonsynchronicity of observations as a kind of missing observations. That is, they first considered a series $r_1, r_2, \dots$ of completely synchronous latent returns. Here, the random vector $r_t = (r^1_t, \dots, r^d_t)$ represents the vector of the latent returns of $d$ assets for the $t$-th period. In each period $t$, the transaction of the $k$-th asset fails to occur with probability $p^k$. If it does not occur, the observed return $(r^k_t)^o$ of the $k$-th asset for the $t$-th period is simply 0. On the other hand, if its transaction occurs in the $t$-th period, $(r^k_t)^o$ becomes the sum of its latent return for the current period and its latent returns for all past consecutive periods in which its transaction has not occurred. Mathematically speaking, we have $(r^k_t)^o = \sum_{i=0}^\infty X^k_t(i)\,r^k_{t-i}$, where $X^k_t(i) = (1 - \delta^k_t)\prod_{j=1}^i \delta^k_{t-j}$ and $(\delta^k_t)$ is an i.i.d. sequence of Bernoulli variables taking the values 1 and 0 with probabilities $p^k$ and $1 - p^k$, which are independent of the latent returns. In the following we consider a continuous-time analog of this model.
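The discrete-time mechanism described above can be sketched in a few lines. This is an illustrative implementation for a single asset, assuming the non-trading indicators $\delta_t \in \{0, 1\}$ are given (1 meaning no transaction in period $t$); the function name `observed_returns` is hypothetical.

```python
def observed_returns(latent, no_trade):
    """Observed returns in the Lo-MacKinlay non-trading mechanism (one asset).

    latent:   latent returns r_1, r_2, ...
    no_trade: indicators delta_t (1 = no transaction in period t, 0 = transaction)
    A period without a transaction reports 0; a period with a transaction
    reports the current latent return plus all latent returns accumulated
    over the immediately preceding run of non-trading periods.
    """
    observed = []
    pending = 0.0  # latent returns not yet reflected in an observed return
    for r, delta in zip(latent, no_trade):
        pending += r
        if delta == 1:
            observed.append(0.0)
        else:
            observed.append(pending)
            pending = 0.0
    return observed


print(observed_returns([1, 2, 3, 4], [0, 1, 1, 0]))
```

In this toy input the asset trades in periods 1 and 4 only, so the fourth observed return collects the latent returns of periods 2, 3 and 4, in agreement with the weights $X_t(i) = (1 - \delta_t)\prod_{j=1}^i \delta_{t-j}$.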

Let $(\Omega', \mathcal{F}', (\mathcal{F}'_t), P')$ be a stochastic basis as in the preceding example. Suppose that there exists a sequence $(\tau_m)_{m \in \mathbb{Z}_+}$ of $(\mathcal{F}'_t)$-predictable times (which will depend on $n$) such that $\tau_0 = 0$ and $\tau_m \uparrow \infty$ as $m \to \infty$. On the other hand, suppose that we have two sequences $(\delta^1_m)_{m \in \mathbb{Z}_+}$ and $(\delta^2_m)_{m \in \mathbb{Z}_+}$ of i.i.d. random variables on a probability space $(\Omega'', \mathcal{F}'', P'')$. Suppose also that they are mutually independent and $P(\delta^k_m = 1) = 1 - P(\delta^k_m = 0) = p^k \in [0, 1)$ for each $k = 1, 2$ and $m \in \mathbb{Z}_+$. Then we define the stochastic basis $\mathcal{B}^{(0)}$ by (4.2).

Let $M(i)^k = \min\{M \mid \sum_{m=0}^M (1 - \delta^k_m) = i\}$. Then we define $(S^i)$ and $(T^j)$ by $S^i = \tau_{M(i)^1}$ and $T^j = \tau_{M(j)^2}$. By construction $S^i$ and $T^j$ are $\mathbf{F}^{(0)}$-predictable times and satisfy (2.1). For example, if we take $\tau_m = mb_n$, then $(S^i)$ and $(T^j)$ become mutually independent Bernoulli sampling schemes (i.e., discretized Poisson sampling schemes). In this case it can be easily shown that [H1]-[H2] and [H4] hold with $G_s = (1 - p^1)^{-1} + (1 - p^2)^{-1} - (1 - p^1p^2)^{-1}$, $\chi_s = (1 - p^1)(1 - p^2)$, $F^1_s = (1 - p^1)^{-1}$, $F^2_s = (1 - p^2)^{-1}$ and

$F^{1*2}_s = (2 - (1 - p^1)(1 - p^2))(1 - p^1p^2)^{-1}$.
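The constants of the Bernoulli sampling scheme are simple rational functions of $p^1$ and $p^2$. The sketch below evaluates them as read from the example and compares $G_s$ with a direct series computation of $E[\max(A, B)]$ for two independent geometric waiting times $A$, $B$ with success probabilities $1 - p^1$ and $1 - p^2$. Interpreting $G_s$ as this mean waiting time (in units of $b_n$) until both assets have been observed is an assumption made here for illustration, and the function names are hypothetical.

```python
def bernoulli_design_constants(p1, p2):
    """Constants of Example 4.5 for tau_m = m * b_n (as read from the text)."""
    q1, q2, q12 = 1.0 - p1, 1.0 - p2, 1.0 - p1 * p2
    return {
        "G": 1.0 / q1 + 1.0 / q2 - 1.0 / q12,
        "chi": q1 * q2,
        "F1": 1.0 / q1,
        "F2": 1.0 / q2,
        "F12": (2.0 - q1 * q2) / q12,
    }


def expected_max_waiting_time(p1, p2, terms=10000):
    """E[max(A, B)] = sum_{k>=0} P(max(A, B) > k) for independent geometric
    waiting times with P(A > k) = p1**k and P(B > k) = p2**k."""
    return sum(p1 ** k + p2 ** k - (p1 * p2) ** k for k in range(terms))


constants = bernoulli_design_constants(0.5, 0.5)
print(constants["G"], expected_max_waiting_time(0.5, 0.5))
```

When $p^1 = p^2 = 0$ every grid point is observed for both assets and all five constants reduce to 1.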

By including endogeneity in $(\tau_m)$, we can also obtain endogenous sampling times. For example, let $W$ be a one-dimensional Wiener process on $(\Omega', \mathcal{F}', (\mathcal{F}'_t), P')$ and let $(\zeta_m)_{m \in \mathbb{Z}_+}$ be a sequence of i.i.d. positive random variables on $(\Omega'', \mathcal{F}'', P'')$. We assume they are independent of $((\delta^1_m, \delta^2_m))_{m \in \mathbb{Z}_+}$ and have mean 1. Further, suppose that $E[|\zeta_m|^p] < \infty$ for some $p > 6$. Then we define $(\tau_m)$ sequentially by $\tau_0 = 0$ and $\tau_{m+1} = \inf\{t > \tau_m \mid W_t - W_{\tau_m} + b_n^{-1/2}\mu(t - \tau_m) = b_n^{1/2}c\zeta_m\}$ ($m = 0, 1, \dots$). Here, $\mu$ and $c$ are some positive constants. This is a simple case of the mixed hitting-time model considered in Example 4.3. Then, by a simple computation similar to that of Example 4.3 we can show that [H1]-[H2] and [H4] hold with

$G_s = \psi\left(\frac{1}{1 - p^1} + \frac{1}{1 - p^2} - \frac{1}{1 - p^1p^2}\right)$, $\chi_s = (1 - p^1)(1 - p^2)$,

$F^1_s = \frac{\psi}{1 - p^1}$, $F^2_s = \frac{\psi}{1 - p^2}$, $F^{1*2}_s = \psi\,\frac{2 - (1 - p^1)(1 - p^2)}{1 - p^1p^2}$,

where $\psi = c/\mu$.



5 Application and discussion 

5.1 Asymptotic variance estimation 

The central limit theorem derived in Section 3 is infeasible in the sense that the asymptotic variance of the estimation error is unobservable. In order to derive a feasible central limit theorem, we therefore need an estimator for the asymptotic variance. In this subsection we implement this with a kernel-based approach as in Section 8.2 of [25] and the second estimator in Section 4 of [11].

For this purpose we need to construct global estimators for (i) the integrated (co-)volatility processes $[X]$, $[Y]$ and $[X, Y]$, (ii) the covariance matrix process $\Psi$ of the noise process and (iii) the asymptotic variance
process J Q ([A, Y]' S F^ — [X,Y\' S F 2 ) Gj 1 ds due to the presence of the endogenous noise. We have already 
established a class of estimators for the case (i) i.e., the PHY, so that in the following we construct estimators 
for the cases (ii) and (iii). 

For the case (ii) we know several consistent estimators when the noise process is i.i.d. One of the most familiar among them is the re-scaled realized covariance: $\sum_{k: R^k \le t}(X_{\hat{S}^k} - X_{\hat{S}^{k-1}})(Y_{\hat{T}^k} - Y_{\hat{T}^{k-1}})/(2N^n_t)$. In the present situation, however, this estimator is not appropriate because, for instance, $X_{\hat{S}^k}$ and $Y_{\hat{T}^{k-1}}$ are possibly correlated due to the endogenous noise. Instead, we use the (scaled) symmetric first-order realized autocovariance estimator proposed by Oomen [41]:

$\gamma^n_t(1)^{12} = -\frac{1}{2k_n^2}\sum_{k: R^{k+1} \le t}\left\{(X_{\hat{S}^k} - X_{\hat{S}^{k-1}})(Y_{\hat{T}^{k+1}} - Y_{\hat{T}^k}) + (X_{\hat{S}^{k+1}} - X_{\hat{S}^k})(Y_{\hat{T}^k} - Y_{\hat{T}^{k-1}})\right\}$, $t \in \mathbb{R}_+$.

Similarly we define

$\gamma^n_t(1)^{11} = -\frac{1}{k_n^2}\sum_{k: S^{k+1} \le t}(X_{S^k} - X_{S^{k-1}})(X_{S^{k+1}} - X_{S^k})$, $\gamma^n_t(1)^{22} = -\frac{1}{k_n^2}\sum_{k: T^{k+1} \le t}(Y_{T^k} - Y_{T^{k-1}})(Y_{T^{k+1}} - Y_{T^k})$

for each $t \in \mathbb{R}_+$. For the case (iii), we use a pre-averaging based estimator. First, for measurable bounded functions $\alpha$, $\beta$ on $\mathbb{R}$ we introduce the following quantity:



b^(X,y)? = -1 E x a (iyY P (J)\ tem + . 



i:R><t 

Next, fix a $C^2$ function $f$ on $[0, 1]$ satisfying $f(0) = f(1) = f'(0) = f'(1) = 0$ (e.g., $f(x) = x^2(1 - x)^2$ for $x \in [0, 1]$) and extend it to the whole real line by setting $f(x) = 0$ for $x \notin [0, 1]$. Then define $\Xi[f]^n = \{\Xi_{f,f'}(X, Y)^n - \Xi_{f',f}(X, Y)^n\}/(2\|f'\|_2^2)$, where $\|f'\|_2^2 = \int_0^1 f'(x)^2\,dx$.

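Now we construct local estimators for the quantities appearing in the asymptotic variance (3.3) from the global ones introduced above. Let $(h_n)_{n \in \mathbb{N}}$ be a sequence of positive numbers tending to 0 as $n \to \infty$, and define

$\widehat{[X]}'_s = h_n^{-1}\left(PHY(X, X)^n_s - PHY(X, X)^n_{(s - h_n)_+}\right)$, $\widehat{[Y]}'_s = h_n^{-1}\left(PHY(Y, Y)^n_s - PHY(Y, Y)^n_{(s - h_n)_+}\right)$,

$\widehat{[X, Y]}'_s = h_n^{-1}\left(PHY(X, Y)^n_s - PHY(X, Y)^n_{(s - h_n)_+}\right)$, $\partial\gamma^n_s(1)^{11} = h_n^{-1}\left\{\gamma^n_s(1)^{11} - \gamma^n_{(s - h_n)_+}(1)^{11}\right\}$,

$\partial\gamma^n_s(1)^{22} = h_n^{-1}\left\{\gamma^n_s(1)^{22} - \gamma^n_{(s - h_n)_+}(1)^{22}\right\}$, $\partial\gamma^n_s(1)^{12} = h_n^{-1}\left\{\gamma^n_s(1)^{12} - \gamma^n_{(s - h_n)_+}(1)^{12}\right\}$,

$\partial\Xi[f]^n_s = h_n^{-1}\left(\Xi[f]^n_s - \Xi[f]^n_{(s - h_n)_+}\right)$

for each $s \in \mathbb{R}_+$. Then, we obtain the following result: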
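All of the local estimators above share one construction: a backward difference quotient of a cumulative (global) estimator over a window of length $h_n$, truncated at time 0. A minimal sketch of that step, assuming the global estimator is available as a function of time (the name `local_slope` and the callable interface are hypothetical):

```python
def local_slope(cumulative, s, h):
    """Backward difference quotient h^{-1} * (F(s) - F((s - h)_+)).

    cumulative: callable t -> F(t), a global (cumulative) estimator
    s:          time point at which the local quantity is wanted
    h:          window length h_n > 0
    """
    return (cumulative(s) - cumulative(max(s - h, 0.0))) / h


# Toy cumulative process with constant slope 2.
print(local_slope(lambda t: 2.0 * t, 1.0, 0.25))
# Near the origin the window is truncated at 0 but still divided by h.
print(local_slope(lambda t: 2.0 * t, 0.1, 0.25))
```

The second call illustrates that for $s < h_n$ the quotient is taken over a shorter interval while the divisor stays $h_n$, so the local estimate is shrunk there. This affects only a window of length $h_n$, which tends to 0.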

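Lemma 5.1. Suppose [H1]-[H6] and $[K_{4/3}]$ are satisfied. Suppose also that $h_n^{-1}b_n^{1/4} \to 0$ as $n \to \infty$. Then

(a) $\widehat{[X]}'_s \to^p [X]'_{s-}$, $\widehat{[Y]}'_s \to^p [Y]'_{s-}$ and $\widehat{[X, Y]}'_s \to^p [X, Y]'_{s-}$ as $n \to \infty$ for any $s \in \mathbb{R}_+$. Furthermore, $\sup_{0 \le s \le t}|\widehat{[X]}'_s|$, $\sup_{0 \le s \le t}|\widehat{[Y]}'_s|$ and $\sup_{0 \le s \le t}|\widehat{[X, Y]}'_s|$ are tight for any $t > 0$.

(b) $\partial\gamma^n_s(1)^{11} \to^p \theta^{-2}\Psi^{11}_{s-}/G_{s-}$, $\partial\gamma^n_s(1)^{22} \to^p \theta^{-2}\Psi^{22}_{s-}/G_{s-}$, and $\partial\gamma^n_s(1)^{12} \to^p \theta^{-2}\Psi^{12}_{s-}/G_{s-}$ as $n \to \infty$ for any $s \in \mathbb{R}_+$. Furthermore, $\sup_{0 \le s \le t}|\partial\gamma^n_s(1)^{11}|$, $\sup_{0 \le s \le t}|\partial\gamma^n_s(1)^{22}|$ and $\sup_{0 \le s \le t}|\partial\gamma^n_s(1)^{12}|$ are tight for any $t > 0$.

(c) $\partial\Xi[f]^n_s \to^p ([\underline{X}, Y]'_{s-}F^1_{s-} - [X, \underline{Y}]'_{s-}F^2_{s-})/(\theta G_{s-})$ as $n \to \infty$ for any $s \in \mathbb{R}_+$. Furthermore, $\sup_{0 \le s \le t}|\partial\Xi[f]^n_s|$ is tight for any $t > 0$.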

We give a proof of Lemma 5.1 in Appendix B. According to the above lemma, we can construct a kernel-based estimator for the asymptotic variance as follows. Set



w 2 R k = k n 4>- 4 



HY 



-n{\x]' Rk d^ k (ir + MV^(i) 11 + 2[XY~y Rk d 1 - k (ir (as[/]» fc ) 2 }] |r fc ||r 



k+l\ 
(5.1) 
for each $k \in \mathbb{N}$, and define $AVAR^n_t = \sum_{k: R^k \le t} w^2_{R^k}$ for every $t \in \mathbb{R}_+$. Then we obtain the following result:

Theorem 5.1. Suppose [H1]-[H6] and $[K_{4/3}]$ are satisfied. Suppose also that $h_n^{-4}b_n \to 0$ as $n \to \infty$. Then we have $b_n^{-1/2}AVAR^n \to^{ucp} \int_0^\cdot w_s^2\,ds$ as $n \to \infty$, where $w_s$ is defined by (3.3).

Proof. Since J Q w 2 ds is a continuous non-decreasing process, it is sufficient to prove the pointwise conver- 

]/2 n 1/2 ■< 

gence. By Lemma A. 5 in Appendix A (or Lemma 10.2(a) of [32]), we have b n AVAR t — b n ^2k-R k <t w2 R k 

as n — > oo for every t, where w 2 R k is defined by (5.1) with replacing |r fc+1 | by G n Rk . Then, we obtain the 
desired result by Lemma 5.1 and the dominated convergence theorem. □

Combining Theorem 5.1 with Theorem 3.1, we obtain the following feasible central limit theorem: 

Corollary 5.1. Under the same assumptions as those of Theorem 5.1, we have

$\frac{PHY(X, Y)^n_t - [X, Y]_t}{\sqrt{AVAR^n_t}} \xrightarrow{d} N(0, 1)$

as $n \to \infty$ for any $t \in \mathbb{R}_+$ whenever $\int_0^t w_s^2\,ds > 0$ a.s.

5.2 Autocorrelated noise

We have so far assumed that the observation noise is not autocorrelated, conditionally on $\mathcal{F}^{(0)}$. In empirical studies of financial high-frequency data, however, there is a lot of evidence that microstructure noise is autocorrelated (see [22] and [47] for instance). In this subsection we briefly discuss the case in which the observation noise is autocorrelated conditionally on $\mathcal{F}^{(0)}$.

We focus on the synchronous case. That is, we assume that $S^i = T^i$ for all $i$. Note that in this case it holds that $\hat{S}^k = \hat{T}^k = R^k = S^k$ for all $k$. Let $(\lambda^l_u)_{u \in \mathbb{Z}_+}$ and $(\mu^l_u)_{u \in \mathbb{Z}_+}$ ($l = 1, 2$) be four sequences of real numbers such that



$\sum_{u=1}^\infty u|\lambda^l_u| < \infty$ and $\sum_{u=1}^\infty u|\mu^l_u| < \infty$. (5.2)

We assume that the observation data $(X_{S^i})$ and $(Y_{S^i})$ are of the form



u=0 



0. 



Ye 



E A ^-< 

u=0 



£ 1/a -2-*— 0- 



M=0 



(5.3) 



In other words, the observation noise follows a kind of linear process. In such a situation the asymptotic mixed normality of our estimators is still valid:

Theorem 5.2. Suppose that $S^i = T^i$ for every $i$. Suppose also that [H1](i)-(ii), [H2](i)-(ii), [H3]-[H6], (5.2) and (5.3) are satisfied. Then (3.1) holds true as $n \to \infty$, where $W$ is the same as in Theorem 3.1 and $w$ is given by



1> 



HY 



Ok {[X]' s [Y]' s + ([A, YD 2 } G s + <T 3 « j*^f + (*f) 2 j G; 1 



li 



^ KG s ,ff = Ag *f + [Y]' S G S and^l 2 ^ XhX 2 ^l 2 +^l[X,Yy s G s , 



where X l Q = ^^L X l and = E^Lo /4i / or efflc/l * = 1, 2. 

We give a proof of Theorem 5.2 in Appendix C. The proof is based on a Beveridge-Nelson type decomposition for the noise. In the nonsynchronous case, we will need to model the autocorrelation structure of the noise in terms of the time dependence in calendar time (as [47] did) rather than tick time (as in the above). This is because we have two axes of tick time, $(S^i)$ and $(T^j)$, in the nonsynchronous case, and this fact complicates the analysis of our estimator. However, this topic is beyond the scope of this paper, so we postpone it to further research.



5.3 Comparison to other approaches 

In this subsection we briefly discuss the behavior of other noise-robust volatility estimators in the presence of time endogeneity.

First we focus on the modulated realized covariance (MRC) proposed in Christensen et al. [10], which is another pre-averaging based covariance estimator. It can basically be considered as a pre-averaging version of the realized covariance, i.e., $\Xi_{g,g}(X, Y)^n$ (with an appropriate scaling). This quantity, however, has a bias which is given by the covariance of the noise multiplied by some constant, so that we need to include a bias correction term. For the reason presented in Section 5.1, we use the estimator $\gamma^n(1)^{12}$ for estimating the covariance of the noise. More precisely, we define the process $MRC(X, Y)^n$ by

$MRC(X, Y)^n_t = \frac{1}{\psi_2}\Xi_{g,g}(X, Y)^n_t - \frac{\psi_1}{\psi_2}\gamma^n_t(1)^{12}$, $t \in \mathbb{R}_+$,

where $\psi_1 = \int_0^1 g'(s)^2\,ds$ and $\psi_2 = \int_0^1 g(s)^2\,ds$.

For any a, (3 £ T, we define the function 4>a : fj on K by 4> a ,p{s) = / a(u — s)j5[u)du. After that, we set 

$11 = J* g ', 9 '(s) 2 ds, $22 = J 4>g, g (s) 2 ds and $ i2 = J 4> g . g {s)(j)g / , g >(s)ds. Then we obtain the following 
result: 

Theorem 5.3. Suppose [H1]-[H6] are satisfied. Then

$b_n^{-1/4}\{MRC(X, Y)^n - [X, Y]\} \to^{d_s} \int_0^\cdot w_s\,dW_s$

as $n \to \infty$, where $W$ is the same as in Theorem 3.1 and $w$ is given by



w 2 s = 2^" 2 



#$ 22 {[X]' S [Y]' S + ([X,Y]' S ) 2 } G s + 0- 3 $ u S^l^f + ($f) 2 | G; 1 

+ O- 1 ^ {[XY^f + [Y]'^] 1 + 2[X,Y)' s *f (\X,Y]' s Fj - [X,Y]' s Fs) 2 G; 1 } 



(5.4) 



The proof is in Appendix D. The above result tells us that the time endogeneity also has no impact on the first order asymptotic property of the MRC. In this sense the MRC is better than the PHY because the former usually has a smaller asymptotic variance than the latter. However, the MRC is not robust to autocorrelated noise, so that in this article we mainly focus on the PHY for practical application.

On the other hand, in general the time endogeneity seems to have some impact on the asymptotic distribution of noise-robust volatility estimators. That is, the "robustness to the time endogeneity" is presumably a special feature of the pre-averaging technique. One piece of evidence for this conjecture is the analysis in [35]. We give another piece of heuristic evidence in the following.

For simplicity we focus on the univariate case and suppose that = 0. We also assume that [K p ] holds 
true for some $p > 2$ and $S^0, S^1, \dots$ are independent of $X$. We shall consider the multiscale realized volatility (MSRV) proposed in Zhang [52]. The MSRV is given by the following formula:

$\widehat{[X, X]}^{multi}_1 = \sum_{i=1}^{M_n}\frac{a_{i,M_n}}{i}\sum_{j=i}^{N}(X_{S^j} - X_{S^{j-i}})^2$, (5.5)

where $N = \#\{i \in \mathbb{N} \mid S^i \le 1\}$, $M_n = \lceil c_{multi}\sqrt{N}\rceil$ with a positive constant $c_{multi}$ and

$a_{i,M_n} = \frac{12i^2}{M_n^3 - M_n} - \frac{6i}{M_n^2 - 1} - \frac{6i}{M_n^3 - M_n}$.

This specification follows from Bibinger [8]. Then, by Lemma 2.2 of [23] and Theorem 2 of [8] the asymptotic distribution of the (scaled) estimation error $N^{1/4}(\widehat{[X, X]}^{multi}_1 - [X, X]_1)$ of the MSRV is given by

under some regularity conditions, where $\xi$ is a standard normal random variable independent of $\mathcal{F}$. The
term G(2) s /G(l) s appearing in the above seems to reflect the asymptotic kurtosis of returns, so that time 
endogeneity seems to have an impact on the asymptotic distribution of the MSRV. Although the verification 
of this conjecture is left to future research, we will examine it numerically by a Monte Carlo study in the 
next section. 
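For readers who want to experiment with the MSRV, the following sketch implements the double sum in (5.5) with the weights $a_{i,M_n} = \frac{12i^2}{M_n^3 - M_n} - \frac{6i}{M_n^2 - 1} - \frac{6i}{M_n^3 - M_n}$ as reconstructed above. These weights satisfy $\sum_i a_{i,M_n} = 1$ and $\sum_i a_{i,M_n}/i = 0$, which is the usual rationale for multiscale weights: the first identity keeps the signal and the second cancels the leading noise term. The function names are hypothetical, and the tuning of $M_n$ (Algorithm 2 of [8]) is not reproduced here.

```python
def msrv_weights(m):
    """Multiscale weights a_{i,m}, i = 1..m (requires m >= 2)."""
    if m < 2:
        raise ValueError("m must be at least 2")
    cube = m ** 3 - m
    return [12.0 * i * i / cube - 6.0 * i / (m * m - 1) - 6.0 * i / cube
            for i in range(1, m + 1)]


def msrv(x, m):
    """Multiscale realized volatility of observations x[0..N] with m scales."""
    n = len(x) - 1
    weights = msrv_weights(m)
    total = 0.0
    for i in range(1, m + 1):
        # Realized volatility computed from lag-i increments.
        rv_i = sum((x[j] - x[j - i]) ** 2 for j in range(i, n + 1))
        total += weights[i - 1] / i * rv_i
    return total


a = msrv_weights(10)
print(sum(a), sum(w / i for i, w in enumerate(a, start=1)))
```

The direct double loop costs $O(NM_n)$ operations, which is acceptable for $N$ of the order of a few thousand as in the simulation study.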

Remark 5.1. (i) The term G(2) s /G(l) s also appears in the asymptotic variance of the realized kernel; see 
[4] for details. On the other hand, it is known that the asymptotic distribution of the realized quasi-maximum 
likelihood estimator is not affected by the irregularity of sampling times at least for renewal sampling schemes 
independent of observed processes, which is similar to our approach; see Section 4.3.2 of [49] and Corollary 
1 of [46]. 

(ii) The "robustness to the time endogeneity" property in the above has a good aspect and bad aspect. The 
good aspect is that we do not need to pay attention to the structure of sampling times precisely for statistical 
application of the estimator. The bad aspect is that we miss a chance to seek more efficient sampling schemes, 
as manifested in Section 5 of [17] for the RV case. This can be seen as a common trade-off between efficiency 
and robustness. 

(iii) It would be interesting to compare our approach with the model with uncertainty zones of [45] , which is 
another approach allowing us to deal with endogenous noise, endogenous times and nonsynchronous obser- 
vations simultaneously. 

6 Simulation study 

In this section we conduct a simulation analysis to illustrate the finite sample accuracy of some of the 
asymptotic results developed above. We focus on the univariate PHY. 

6.1 Simulation design 

We simulate data for one day ($t \in [0, 1]$). Following [34, 35], we consider sampling times generated by hitting barriers as illustrated in Example 4.1. Specifically, we define the sampling times $(S^i)$ by Eq. (4.1) with $u = 0.01$, $v = 0.04$, $b_n = n^{-1} = 3600^{-1}$ and $M^X_t = \sigma W_t$, where $\sigma = 0.02$ and $W_t$ is a one-dimensional Wiener process. This specification follows Section 5 of [34]. For comparison we also consider an equidistant sampling scheme, i.e., $S^i = i/n$. We will refer to the former as the hitting sampling and the latter as the equidistant sampling.
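On a discretized path, the hitting sampling of Eq. (4.1) can be approximated by scanning a fine grid and recording the first grid point at which the move since the last sampling time leaves the interval $(-u\sqrt{b_n}, v\sqrt{b_n})$. The sketch below is one such approximation and not the simulation code used for the tables. The function name is hypothetical, and on a grid the barriers are typically overshot slightly, so the grid should be much finer than $b_n$.

```python
import math


def hitting_sample_indices(path, u, v, b_n):
    """Grid indices of the (approximate) barrier-hitting sampling times.

    path: values of the driving martingale on a fine time grid
    u, v: positive barrier constants; b_n: scale parameter of Eq. (4.1)
    """
    lower = -u * math.sqrt(b_n)
    upper = v * math.sqrt(b_n)
    indices = [0]  # S^0 = 0
    for k in range(1, len(path)):
        move = path[k] - path[indices[-1]]
        if move <= lower or move >= upper:
            indices.append(k)
    return indices


# Deterministic toy paths, only to show the mechanics.
print(hitting_sample_indices([0, 1, 2, 3, 4, 5, 6], 2.5, 2.5, 1.0))
print(hitting_sample_indices([0, -1, -2, 0, 3], 1.5, 2.5, 1.0))
```

With the constants of this section, $uv b_n/\sigma^2 = 1/3600$, so the expected duration between sampling times matches the equidistant spacing $1/n$.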

One of the novel findings of this article is that the PHY has no limiting bias even if the asymptotic skewness of the latent returns does not vanish, in contrast to the realized volatility. Therefore, it is a good illustration to consider a situation in which the limiting bias of the realized volatility significantly affects its asymptotic distribution. For this reason we adopt the bridge setting for the latent process, following [35]. Specifically, $X$ is generated by a Brownian bridge between 0 and $x_1$. An SDE specification for $X$ can be written as

$dX_t = \frac{x_1 - X_t}{1 - t}\,dt + \sigma\,dW_t$.

While the limiting bias of the realized volatility is proportional to the value of $x_1$ in the light of (1.2), an overly large value will cause a significant bias due to the drift term. In consideration of this trade-off, we set $x_1 = \sigma/2$.
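One convenient way to simulate such a bridge on a grid, sketched below, avoids discretizing the singular drift: if $B$ is a standard Brownian motion, then $\sigma(B_t - tB_1) + tx_1$ has the law of a Brownian bridge from 0 to $x_1$ with volatility $\sigma$. This is an alternative construction chosen for illustration (the $B$ used here is not the $W$ driving the SDE above), and the function name is hypothetical.

```python
import math
import random


def brownian_bridge(n_steps, sigma, x1, rng):
    """Bridge from 0 (time 0) to x1 (time 1) with volatility sigma on a
    uniform grid with n_steps steps, via X_t = sigma*(B_t - t*B_1) + t*x1."""
    dt = 1.0 / n_steps
    b = [0.0]
    for _ in range(n_steps):
        b.append(b[-1] + rng.gauss(0.0, math.sqrt(dt)))
    b1 = b[-1]
    return [sigma * (b[k] - (k / n_steps) * b1) + (k / n_steps) * x1
            for k in range(n_steps + 1)]


sigma = 0.02
path = brownian_bridge(3600, sigma, sigma / 2.0, random.Random(1))
print(path[0], path[-1])
```

The end point is pinned at $x_1$ by construction, and the quadratic variation over $[0, 1]$ is $\sigma^2$, which is the estimation target in this section.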

To generate the microstructure noise process $(U_{S^i})$, we consider the following three scenarios:
Scenario 1: $U_{S^i} = 0$, i.e., the microstructure noise is absent.
Scenario 2: $U_{S^i} \sim N(0, \gamma\sigma^2)$, where we set $\gamma = 0.001$.
Scenario 3: $U_{S^i} = \delta\sqrt{n}(X_{S^i} - X_{S^{i-1}})$, where we set $\delta = -\sqrt{0.001}$.

The choices of the parameters in the above reflect the empirical findings reported in Hansen and Lunde [22]. That is, the variance of the noise is at most 0.1% of the integrated volatility and the noise is negatively correlated with the latent returns. Simulation results are based on 5000 Monte Carlo iterations for each model.
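The three noise scenarios can be generated along the following lines. This is a sketch under two assumptions that the text does not spell out: the Scenario 2 noise is drawn i.i.d. across observations, and in Scenario 3 the noise at the initial observation (where no previous return exists) is set to 0. The function name and signature are hypothetical.

```python
import math
import random


def microstructure_noise(scenario, x, sigma, n, rng, gamma=0.001, delta=None):
    """Noise U_{S^i} for the sampled latent values x = (X_{S^0}, X_{S^1}, ...)."""
    if delta is None:
        delta = -math.sqrt(0.001)
    if scenario == 1:
        # No noise.
        return [0.0] * len(x)
    if scenario == 2:
        # Exogenous Gaussian noise with variance gamma * sigma^2 (i.i.d. assumed).
        sd = math.sqrt(gamma) * sigma
        return [rng.gauss(0.0, sd) for _ in x]
    if scenario == 3:
        # Endogenous noise proportional to the latest latent return;
        # the first entry is set to 0 by assumption.
        scale = delta * math.sqrt(n)
        return [0.0] + [scale * (x[i] - x[i - 1]) for i in range(1, len(x))]
    raise ValueError("scenario must be 1, 2 or 3")


u3 = microstructure_noise(3, [0.0, 1.0, 3.0], 0.02, 4, random.Random(0))
print(u3)
```

In Scenario 3 with equidistant sampling, the noise variance is $\delta^2 n \cdot \sigma^2/n = 0.001\sigma^2$, the same magnitude as in Scenario 2, and the negative $\delta$ produces the negative correlation with latent returns.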

The implementation of the PHY is as follows. Following [11], we use $\theta = 0.15$ and $g(x) = x \wedge (1 - x)$ and set $k_n = \lceil\theta\sqrt{N}\rceil$ for pre-averaging. Here, $N$ represents the number of the observed returns. We also computed the Studentized statistic

$S_{PHY} := \frac{PHY(X, X)^n_1 - \sigma^2}{\sqrt{AVAR^n_1}}$,



where the estimator $AVAR^n_1$ of the asymptotic variance is constructed as in Section 5.1 using $h_n = N^{-0.2}$. Note that we do not need to specify the function $f$ for computing it because the process $\partial\Xi[f]^n$ is identically 0 in the univariate case.
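The tuning choices above ($\theta = 0.15$, $g(x) = x \wedge (1 - x)$, $k_n = \lceil\theta\sqrt{N}\rceil$) translate into code as follows. The pre-averaged return used here, $\sum_{p=1}^{k_n - 1} g(p/k_n)(X_{i+p} - X_{i+p-1})$, is the standard pre-averaging form and is assumed for illustration. The exact indexing in the definition of the PHY, which is given earlier in the paper, may differ, and the function names are hypothetical.

```python
import math


def g(x):
    """Weight function g(x) = min(x, 1 - x) on [0, 1]."""
    return min(x, 1.0 - x)


def window_length(n_returns, theta=0.15):
    """Pre-averaging window k_n = ceil(theta * sqrt(N))."""
    return math.ceil(theta * math.sqrt(n_returns))


def pre_averaged_returns(x, k_n):
    """Weighted sums of k_n - 1 consecutive returns of the observations x."""
    weights = [g(p / k_n) for p in range(1, k_n)]
    return [
        sum(w * (x[i + p] - x[i + p - 1]) for p, w in zip(range(1, k_n), weights))
        for i in range(len(x) - k_n + 1)
    ]


print(window_length(500))
print(pre_averaged_returns([0, 1, 2, 3, 4], 4))
```

For a path with unit increments and $k_n = 4$, the weights $0.25, 0.5, 0.25$ sum to one, so every pre-averaged return equals 1.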

For comparison we also computed the RV and the MSRV defined by (5.5) as well as their Studentizations. The Studentization Srv of the RV was computed by the left-hand side of Eq. (1.1) (with $t = 1$). The tuning parameter $c_{multi}$ and the estimator $AVAR_{multi}$ for the asymptotic variance of the MSRV were computed on the basis of Algorithm 2 of [8]. Then the Studentization of the MSRV is given by



6.2 Simulation results 

Table 1 reports the relative bias and the root mean squared error (rmse) of each estimator. That is, we report the sample bias and the root mean squared error of each estimator, each divided by $\sigma^2$. Since the RV is inconsistent in the presence of noise, it performs very poorly in Scenarios 2 and 3. As expected from the discussions in the preceding sections, the PHY has the smallest bias in the presence of the time endogeneity. Interestingly, even in the absence of noise the PHY has a smaller bias than the RV when the sampling times are endogenous. On the other hand, in each scenario the difference between the rmse values of the two sampling schemes for each estimator (except, of course, for the RV in the noisy settings) is small, at least compared with the differences in the bias values. This is also implied by the theory developed above and the formula (1.2).

Next we turn to the accuracy of the asymptotic approximation, which is the main theme of this article. In Figure 1 we plot the kernel densities of the Studentized statistics Sphy, Srv and Smsrv for Scenario 1 with the equidistant sampling (left panel) and the hitting sampling (right panel). In the equidistant sampling case, all of the standard normal approximations perform fairly well. As expected from the asymptotic theory, Srv offers the best approximation. On the other hand, the standard normal approximations of Srv and Smsrv completely fail in the hitting sampling case. In fact, their densities shift to the right and become tall and narrow. This is exactly as expected from the asymptotic distribution (1.2) for the RV, while it is conjectured from the discussion in Section 5.3 for the MSRV. In contrast, the approximation of Sphy still performs fairly well. This is in line with the theory developed in this article.

To test the normality of the Studentized statistics quantitatively, we compare their quantiles with those of the standard normal distribution for each scenario. The results are reported in Tables 2-4. We also report the sample mean and standard deviation as well as the 95% coverage. Note that for Scenarios 2 and 3 we do not report the results for the RV because of its lack of consistency. As the tables reveal, we can again observe that the distributions of Srv and Smsrv shift to the right in the hitting sampling cases. Also, the quantiles of Sphy (and Smsrv for the equidistant sampling case) do not look good enough, but this is not surprising because they have rather slower convergence rates than Srv. In fact, such an observation has already been made in the literature (e.g., [3] and [27]). It is worth mentioning that the 95% coverage of Sphy seems to be fairly good in practice. It is also interesting to observe that the performance of Sphy in the hitting sampling case is superior to that in the equidistant sampling case.

Now we make some efforts to improve the finite sample performance of the asymptotic approximation for the estimation error of the PHY. For this purpose we consider the log-transform

$S_{log} := \frac{PHY(X, X)^n_1\{\log PHY(X, X)^n_1 - \log\sigma^2\}}{\sqrt{AVAR^n_1}}$.

By the delta method we have $S_{log} \xrightarrow{d} N(0, 1)$ as $n \to \infty$. It is well known that this type of transformation often improves the finite sample performance of asymptotic approximations for volatility estimators based on high-frequency data (see [21] and [3] for instance). In fact, this phenomenon can be explained theoretically
Table 1: Relative bias and root mean squared error

              PHY              RV               MSRV

Equidistant sampling

Scenario 1    -0.008 (0.089)   0.000 (0.023)    -0.001 (0.044)

Scenario 2    -0.005 (0.094)   7.204 (7.208)    -0.006 (0.074)

Scenario 3    -0.005 (0.091)   3.407 (3.409)    -0.002 (0.063)

Hitting sampling

Scenario 1    0.006 (0.092)    0.013 (0.023)    0.012 (0.040)

Scenario 2    0.007 (0.097)    6.971 (6.976)    0.009 (0.075)

Scenario 3    0.006 (0.094)    3.460 (3.461)    0.012 (0.065)

Note. We report the relative bias and rmse of the estimators included in the simulation study. The number reported in parentheses is the rmse.



Table 2: Comparisons of quantiles of Studentized statistics with N(0, 1) (Scenario 1)

Mean SD 0.5% 2.5% 5% 95% 97.5% 99.5% Cove. (95%)

Equidistant sampling

Sphy -0.19 1.03 1.82% 4.88% 8.90% 97.38% 99.18% 99.94% 94.30%

Srv -0.02 0.99 0.78% 2.80% 5.32% 95.74% 98.14% 99.68% 95.34%

Smsrv -0.07 1.15 1.88% 5.54% 8.90% 93.82% 96.54% 99.12% 91.00%

Hitting sampling

Sphy -0.03 1.01 1.08% 3.48% 6.68% 96.42% 98.46% 99.88% 94.98%

Srv 0.50 0.73 0.00% 0.00% 0.16% 94.42% 97.92% 99.86% 97.92%

Smsrv 0.20 0.71 0.00% 0.22% 0.66% 97.94% 99.44% 99.94% 99.22%



by higher-order asymptotic properties in some cases; see [21] for details. Furthermore, [21] pointed out that there exist alternative transforms outperforming the log-transform. Motivated by this study, we also consider the inverse transform

$S_{inv} := \frac{(PHY(X, X)^n_1)^2\{\sigma^{-2} - (PHY(X, X)^n_1)^{-1}\}}{\sqrt{AVAR^n_1}}$

following the suggestion of [21] for the RV. We show the results for these statistics in Tables 5-7. We can see that the accuracy of the asymptotic approximation is markedly improved across all the scenarios, compared with the raw statistic case. Further, in the equidistant sampling case Sinv seems to work better than Slog, as predicted from the study of [21], while this ordering appears to be reversed in the hitting sampling case. To understand these findings theoretically, we are likely to need a higher-order asymptotic theory for the PHY. This is more involved and beyond the scope of this article.
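The raw, log-transformed and inverse-transformed statistics can be compared side by side with a few lines of code. The forms below are the ones obtained from the delta method applied to $x \mapsto \log x$ and $x \mapsto 1/x$, with the sign chosen so that all three statistics agree to first order. They are a sketch under that assumption and the paper's exact definitions may differ in presentation. The function names are hypothetical, and `avar` stands for the estimated variance of the estimation error.

```python
import math


def studentized(estimate, target, avar):
    """Raw Studentized statistic (estimate - target) / sqrt(avar)."""
    return (estimate - target) / math.sqrt(avar)


def log_transformed(estimate, target, avar):
    """Delta method with log: Var(log est) is approximately avar / est^2."""
    return estimate * (math.log(estimate) - math.log(target)) / math.sqrt(avar)


def inverse_transformed(estimate, target, avar):
    """Delta method with 1/x: Var(1/est) is approximately avar / est^4."""
    return estimate ** 2 * (1.0 / target - 1.0 / estimate) / math.sqrt(avar)


for name, stat in [("raw", studentized), ("log", log_transformed),
                   ("inv", inverse_transformed)]:
    print(name, stat(1.1, 1.0, 0.01))
```

All three vanish when the estimate equals the target and share the same sign otherwise. They differ only in how strongly they stretch overestimates relative to underestimates, which is what changes the skewness of the finite sample distribution.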

Finally, in Figure 2 we present the QQ plots of the statistics Sphy, Slog and Sinv for Scenario 1 to complement these results visually.
Figure 1: Kernel densities of the Studentized statistics for Scenario 1

[Two panels: equidistant (left), hitting (right)]

Note. We plot the kernel densities of the Studentized statistics for Scenario 1. The left panel is for the equidistant sampling case and the right panel is for the hitting sampling case. The blue dashed line refers to Sphy, the red dotted line refers to Srv, the green solid line refers to Smsrv and the black solid line refers to N(0, 1).



Table 3: Comparisons of quantiles of Studentized statistics with N(0, 1) (Scenario 2) 

Mean SD 0.5% 2.5% 5% 95% 97.5% 99.5% Cove. (95%) 

Equidistant sampling 

Sphy -0.15 1.03 1.82% 4.98% 8.36% 97.58% 98.94% 99.90% 93.96% 

Smsrv -0.13 1.18 2.30% 6.32% 9.66% 93.66% 96.56% 99.12% 90.24% 

Hitting sampling 

Sphy -0.02 1.02 1.24% 4.00% 6.46% 96.34% 98.66% 99.86% 94.66% 

Smsrv 0.08 0.85 0.12% 1.04% 2.48% 96.80% 98.92% 99.84% 97.
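
The columns of these tables can be reproduced from a vector of simulated statistics. The sketch below is one plausible way to do so, assuming (as the tabulated numbers suggest, since the coverage equals the difference between the 97.5% and 2.5% columns) that each quantile column reports the fraction of simulated values below the corresponding N(0, 1) quantile and that the coverage column reports the fraction within the central 95% band. The name `summarize` and the deterministic test grid are hypothetical.

```python
from statistics import NormalDist, mean, stdev

LEVELS = (0.005, 0.025, 0.05, 0.95, 0.975, 0.995)


def summarize(stats, levels=LEVELS):
    """Return mean, SD, empirical mass below each N(0,1) quantile, and 95% coverage."""
    z = NormalDist()
    n = len(stats)
    row = {"mean": mean(stats), "sd": stdev(stats)}
    for p in levels:
        cutoff = z.inv_cdf(p)  # theoretical N(0,1) quantile at level p
        row[p] = sum(1 for s in stats if s <= cutoff) / n
    c = z.inv_cdf(0.975)  # half-width of the two-sided 95% band
    row["coverage95"] = sum(1 for s in stats if abs(s) <= c) / n
    return row


# Deterministic stand-in for simulated statistics: midpoint quantiles of N(0,1).
grid = [NormalDist().inv_cdf((i + 0.5) / 1000) for i in range(1000)]
row = summarize(grid)
```

For a statistic that is exactly standard normal, each quantile column matches its nominal level; departures from the nominal levels in the tables therefore measure the error of the normal approximation.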



Table 4: Comparisons of quantiles of Studentized statistics with N(0, 1) (Scenario 3)

Mean SD 0.5% 2.5% 5% 95% 97.5% 99.5% Cove. (95%)

Equidistant sampling

Sphy -0.15 1.03 1.60% 5.18% 8.26% 97.24% 99.06% 99.96% 93.88%

Smsrv -0.07 1.14 1.70% 5.28% 8.50% 94.06% 96.68% 99.16% 91.40%

Hitting sampling

Sphy -0.03 1.02 1.10% 3.52% 6.74% 96.20% 98.42% 99.88% 94.90%

Smsrv 0.13 0.82 0.14% 0.74% 1.68% 96.82% 98.72% 99.96% 97.98%
Table 5: Comparisons of quantiles of transformed statistics with N(0, 1) (Scenario 1)

Mean SD 0.5% 2.5% 5% 95% 97.5% 99.5% Cove. (95%)

Equidistant sampling

Slog -0.14 1.01 0.94% 3.62% 7.14% 96.32% 98.52% 99.80% 94.90%

Sinv -0.09 1.00 0.42% 2.40% 5.36% 95.10% 97.56% 99.48% 95.16%

Hitting sampling

Slog 0.03 1.01 0.70% 3.00% 5.30% 94.54% 97.72% 99.56% 94.72%

Sinv 0.08 1.02 0.26% 1.90% 4.04% 93.06% 96.40% 99.10% 94.50%



Table 6: Comparisons of quantiles of transformed statistics with N(0, 1) (Scenario 2) 

Mean SD 0.5% 2.5% 5% 95% 97.5% 99.5% Cove. (95%) 

Equidistant sampling 

Slog -0.10 1.01 1.22% 3.60% 6.74% 96.52% 98.44% 99.78% 94.84%

Sinv -0.05 1.00 0.40% 2.42% 5.00% 95.46% 97.68% 99.38% 95.26%

Hitting sampling

Slog 0.03 1.01 0.70% 3.00% 5.30% 94.54% 97.72% 99.56% 94.72%

Sinv 0.08 1.02 0.26% 1.90% 4.04% 93.06% 96.40% 99.10% 94.50%



Table 7: Comparisons of quantiles of transformed statistics with N(0, 1) (Scenario 3)

Mean SD 0.5% 2.5% 5% 95% 97.5% 99.5% Cove. (95%)

Equidistant sampling

Slog -0.11 1.01 0.80% 3.60% 6.90% 96.26% 98.12% 99.74% 94.52%

Sinv -0.06 1.00 0.26% 2.24% 5.32% 95.24% 97.48% 99.38% 95.24%

Hitting sampling

Slog 0.02 1.01 0.64% 2.36% 5.34% 95.02% 97.52% 99.70% 95.16%

Sinv 0.07 1.01 0.24% 1.50% 3.72% 93.28% 96.34% 99.12% 94.84%
Figure 2: Normal QQ plots of the transformed statistics for Scenario 1

[Panels, left to right: standard, log, inverse]

Note. We plot the QQ plots of the Studentized statistics for Scenario 1. The upper panels are for the
equidistant sampling case and the lower panels are for the hitting sampling case. The left panels refer to Sphy,
the middle panels refer to Slog and the right panels refer to Sinv.



Appendix 

A Proof of Theorem 3.1 

We start by introducing some notation. First we explain some generic notation. For processes $V$ and
$W$, $V \bullet W$ denotes the integral (either stochastic or ordinary) of $V$ with respect to $W$. For any semimartingale
$V$ and any (random) interval $I$, we define the processes $V(I)_t$ and $I_t$ by $V(I)_t = \int_0^t 1_I(s-)\,dV_s$ and $I_t = 1_I(t)$
respectively. We denote by T the set of all real-valued piecewise Lipschitz functions $\alpha$ on $\mathbb{R}$ satisfying
$\alpha(x) = 0$ for any $x \notin [0, 1]$. For a function $\alpha$ on $\mathbb{R}$ we write $\alpha^n_p = \alpha(p/k_n)$ for each $n \in \mathbb{N}$ and $p \in \mathbb{Z}$. For any
semimartingale $V$, any sampling design $\mathcal{D} = (D^i)_{i\in\mathbb{N}}$ and any $\alpha \in$ T, we define the process $V_\alpha(\mathcal{D})^i$ for each
$i \in \mathbb{N}$ by $V_\alpha(\mathcal{D})^i_t = \sum_{p=0}^{k_n-1} \alpha^n_p V(D^{i+p})_t$. Then, for any semimartingales $V$, $W$ and any $\alpha, \beta \in$ T, set

$$L_{\alpha,\beta}(V, W)^{ij} = V_\alpha(\mathcal{I})^i_- \bullet W_\beta(\mathcal{J})^j + W_\beta(\mathcal{J})^j_- \bullet V_\alpha(\mathcal{I})^i$$

for each $i, j \in \mathbb{N}$. Furthermore, for any locally square-integrable martingales $M$, $N$, $M'$, $N'$ and any

$\alpha, \beta, \alpha', \beta' \in$ T, set

VZ$£, tfl ,{M,N;M' t N') t 
Secondly we introduce some notation related to the noise processes. Let 

1 oo oo 

- LVfll - £ y - - V f X. ] - 

n p=l n q=l 

$\mathfrak{E}^X$ and $\mathfrak{E}^Y$ are obviously purely discontinuous locally square-integrable martingales on $\mathcal{B}$ if [SH6] holds (note
that both $(S^i)$ and $(T^j)$ are $\mathbf{F}^{(0)}$-stopping times). Furthermore, if $\Psi$ is càdlàg, quasi-left continuous and both
$(S^i)$ and $(T^j)$ are $\mathbf{F}^{(0)}$-predictable times, then we have

1 oo 1 oo 1 oo 

« p=l ™ q=l n p,q=l 

On the other hand, though S k and T k may not be stopping times, we have the following result: 
Lemma A.1. The random variables $I^k_t$ and $J^k_t$ are $\mathcal{F}^{(0)}_t$-measurable for every $k$, $t$.

Proof. Since {I k = 1} = {S" 1 ' < * < S k } = < * < S k ] U {t < S k < S*}], we obtain {I k = 1} e J" t (0) 

and thus $I^k_t$ is $\mathcal{F}^{(0)}_t$-measurable. Similarly we can show that $J^k_t$ is $\mathcal{F}^{(0)}_t$-measurable. □

Due to the above lemma, both of the processes 3 t '■= Yl^Li H an d 3* := are F^^-adapted. 

Therefore, we can define the following processes: 

3C t = -3_*X t , % t = -3-.Y t , 9Jtf = • Mf , m^=-Z-*Mf, 

Then we set ii x = £ x + {k n ^/K)- 1 X, il Y = <£ y + (fc n \/&^) -1 ?), H x = € A " + (/^v^) -1 ^ and E y = 

Finally, for every i,j we define the process K l t ° by -fT*- 7 = ^-ngi s i + fe ™)n[fj TJ+ fc ™)n[o t)^0V Then define 
processes M" and M™ by 

M™ = M B , g (X, y)" + M„,,„, (it* , il y )? + M fllfl , (X, il y )? + M ff ,, 9 (H* , 30?, 
M? = M g , g (M x ,M Y )? + M 9 -, 9 (H x ,ii y )r + M g , g -(M x ,H y )™ + M g ,, g (iF, M y )«, 
where we set 

1 oo 

for any semimartingales $V$, $W$ and any $\alpha, \beta \in$ T. Note that the process $\mathbf{M}^n$ is a locally square-integrable
martingale with respect to the filtration $\mathbf{F}$ under the condition [H6] due to Lemma 4.3 of [32].
Our proof is based on the following lemma:

Lemma A.2. (a) Suppose that [H1](i)-(iii) and [H3]-[H6] hold. Suppose also that $\underline{X} = \underline{Y} = 0$. Then we
have (3.1), where $W$ is the same as in Theorem 3.1 and $w$ is given by (3.2), provided that the following
three conditions are satisfied:

(I) 6~ 1/4 (M" - M") ^ as n oo, 

(II) o~ 1/4 (M™, N) t as n -> oo for every t and any N £ {M x , M Y , M— , M— }. 

(III) For any M, A/' € {X, <£ X ,9JT— }, any N, N' £ {F, (£ y , ffl—} and any a, (3, a', e T. 

=b-^ ]r (A.i) 

as $n \to \infty$ for every $t \in \mathbb{R}_+$.

(b) Suppose that [H1] and [H3]-[H6] hold. Then we have (3.1), where $W$ is as above and $w$ is
given by (3.3), provided that the above three conditions (I)-(III) are satisfied.

Proof. First, it can easily be shown that we can replace the condition [A4] in the assumptions of Proposition
4.3 and Lemma 4.6 of [32] by the condition [H4] (in fact, $\xi' > 1/2$ is sufficient). Therefore, by the assumptions
of the lemma and the conditions (II)-(III), it is sufficient to show that $b_n^{-1/4}(PHY(X,Y)^n - \mathbf{M}^n) \xrightarrow{ucp} 0$ as $n \to \infty$,
and this convergence follows from Lemma 4.2 of [32] and the condition (I). □

According to the above lemma, it is sufficient to verify the conditions (I)-(III).

For the proof we can use a localization procedure, which allows us to systematically replace the
conditions [H2]-[H3] and [H5]-[H6] by the following strengthened versions:
[SH2] (i) S i and T l are -predictable times for every i.

(ii) [H2](ii) holds and the processes G, V G and N G are bounded. Moreover, V G is of class (Aa) for any 
Ae (0,1]. 

(iii) [H2] (iii) holds and the processes %, V x and iV x are bounded. Moreover, V x is of class (Aa) for 
any A G (0, 1]. 

(iv) [H2](iv) holds and the processes F l , V F and N F are bounded for every I = 1, 2, 1 * 2. Moreover, 
V F is of class (Aa) for any A 6 (0, 1] and every I = 1, 2, 1 * 2. 

[SH3] [H3] holds true. Moreover, for each V, W — X, Y, X_, Y the density process / = [V, W]' is bounded and 

of class (Aa) for any A <G (0, 1]. 
[SH5] [H5] holds true. Moreover, for each V = A x , A Y , A—, A— the density process / = V is bounded and 

of class (Aa) for some A <G (0, j). 
[SH6] [H6] holds and (/ \z\ 8 Q t (dz)) t& R + is a bounded process. Moreover, for every i,j = 1,2 the process ^ 

is of class (Aa) for any A e (0, 1]. 

In addition to the above localization procedures, we introduce two other ones. The first one is based on 
the following lemma: 

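Lemma A.3. Suppose that [H3] holds true. Then a.s. we have

$$\limsup_{\delta \downarrow 0}\ \sup_{\substack{s,u \in [0,t] \\ |s-u| < \delta}} \frac{|M^Z_s - M^Z_u|}{\sqrt{2\delta \log\frac{1}{\delta}}} \le \sup_{0 \le s \le t} |[Z]'_s|^{1/2}$$

for any $t \in \mathbb{R}_+$ and every $Z \in \{X, Y, \underline{X}, \underline{Y}\}$.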

Proof. Combining a representation of a continuous local martingale with Brownian motion and Levy's 
theorem on the uniform modulus of continuity of Brownian motion, we obtain the desired result. □ 
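
As a numerical illustration of the modulus-of-continuity bound used here, the sketch below simulates a standard Brownian path on a grid and evaluates the ratio of the largest increment over short lags to $\sqrt{2\delta\log(1/\delta)}$. This is only a finite-grid sanity check under assumptions of our own (grid size, seed, and the helper names `brownian_path` and `modulus_ratio` are hypothetical); it does not replace the almost-sure statement, which concerns the limit as $\delta \downarrow 0$.

```python
import math
import random


def brownian_path(n_steps, rng):
    """Simulate a standard Brownian path on [0, 1] with n_steps equal steps."""
    dt = 1.0 / n_steps
    path = [0.0]
    for _ in range(n_steps):
        path.append(path[-1] + rng.gauss(0.0, math.sqrt(dt)))
    return path, dt


def modulus_ratio(path, dt, max_lag):
    """Largest |B_s - B_u| over grid points with |s - u| <= max_lag * dt,
    divided by sqrt(2 * delta * log(1 / delta)) where delta = max_lag * dt."""
    delta = max_lag * dt
    worst = 0.0
    for lag in range(1, max_lag + 1):
        for i in range(len(path) - lag):
            worst = max(worst, abs(path[i + lag] - path[i]))
    return worst / math.sqrt(2.0 * delta * math.log(1.0 / delta))


rng = random.Random(0)  # fixed seed so the run is reproducible
path, dt = brownian_path(4096, rng)
ratio = modulus_ratio(path, dt, max_lag=64)

# Deterministic check on a straight line, where the largest increment is delta itself.
line = [i / 16 for i in range(17)]
line_ratio = modulus_ratio(line, 1.0 / 16, max_lag=4)
```

For a Brownian path the ratio is expected to be of order one at small $\delta$, which is the behaviour that the localization via (A.2) exploits with a constant $K$.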

We can strengthen Lemma A.3 by a localization in the following way. Suppose there exists a positive
constant $K$ such that $\max_{Z \in \{X, Y, \underline{X}, \underline{Y}\}} |[Z]'_t| \le K$ for all $t \ge 0$. For each $k \in \mathbb{N} \setminus \{1\}$, set

$$\tau_k = \inf\Big\{ t \in (0, \infty)\ \Big|\ \max_{Z \in \{X, Y, \underline{X}, \underline{Y}\}}\ \sup_{\substack{s,u \in [0,t] \\ |s-u| < k^{-1}}} \frac{|M^Z_s - M^Z_u|}{\sqrt{2k^{-1}\log k}} > K + 1 \Big\}.$$

Then $\tau_k$ is a stopping time since $M^Z$ is continuous and adapted, and $\tau_k \uparrow \infty$ a.s. by Lemma A.3. This implies
that if we have [SH3] then we can always assume that there exist positive constants $K$ and $\delta$ such that

$$\max_{Z \in \{X, Y, \underline{X}, \underline{Y}\}}\ \sup_{\substack{s,u \in [0,t] \\ |s-u| < \delta}} \frac{|M^Z_s(\omega) - M^Z_u(\omega)|}{\sqrt{2\delta|\log\delta|}} \le K \qquad (A.2)$$

for all $\omega \in \Omega$ localized by $(\tau_k)$. In the remainder of this section, we always assume that we have positive
constants $K$ and $\delta$ satisfying (A.2), if we have [SH3]. Moreover, whenever we assume $\xi' > 1/2$, we only
consider sufficiently large $n$ such that $4k_n\bar{r}_n < \delta$, where we write $\bar{r}_n = b_n^{\xi'}$.

On the other hand, the second one is as follows. Fix a positive number $\gamma$, and define $v_n$ by

$$v_n = \inf\{t \mid r_n(t) > \bar{r}_n\} \wedge \inf\{t \mid N^n_t > b_n^{-1-\gamma}\}.$$

By construction each $v_n$ is an $\mathbf{F}^{(0)}$-stopping time. Moreover, since [H1](i)-(ii) imply that

$$N^n_t = O_p(b_n^{-1}) \quad \text{for any } t \in \mathbb{R}_+ \qquad (A.3)$$

due to Lemma 10.4 of [32], we have $P(v_n < T) \to 0$ as $n \to \infty$ under the assumptions of the theorem. In the
following we will only consider processes stopped at time $v_n$, so that we always assume that

$$r_n(t) \le \bar{r}_n \quad \text{for any } t \in \mathbb{R}_+ \text{ and any } n \in \mathbb{N} \qquad (A.4)$$

and

$$N^n_t \le b_n^{-1-\gamma} \quad \text{for any } t \in \mathbb{R}_+ \text{ and any } n \in \mathbb{N}. \qquad (A.5)$$
Moreover, $\gamma$ is taken from the interval (0, | (£' — |) A ^j). This is always possible under the condition [H4].

Now we proceed to derive some consequences of these assumptions. In the following we fix $M, M' \in
\{(M^X)^{v_n}, (\mathfrak{E}^X)^{v_n}, (\mathfrak{M}^{\underline{X}})^{v_n}\}$ and $N, N' \in \{(M^Y)^{v_n}, (\mathfrak{E}^Y)^{v_n}, (\mathfrak{M}^{\underline{Y}})^{v_n}\}$. Then we set

$$M^{p,q} = M(I^p)M'(I^q) - \langle M(I^p), M'(I^q)\rangle, \qquad L^{p,q} = M(I^p)N'(J^q) - \langle M(I^p), N'(J^q)\rangle$$

for each $p$, $q$.

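Note that (A.2), (A.4) and [SH6] imply that for any $r \in [0, 8]$ and $t \in \mathbb{R}_+$ there exists a positive constant
$C$ such that

$$E_0\Big[\sup_{0 \le s \le t} |M(I^p)_s|^r\Big] + E_0\Big[\sup_{0 \le s \le t} |M'(I^p)_s|^r\Big] + E_0\Big[\sup_{0 \le s \le t} |N(J^p)_s|^r\Big] + E_0\Big[\sup_{0 \le s \le t} |N'(J^p)_s|^r\Big] \le C(\bar{r}_n|\log b_n|)^{r/2} \qquad (A.6)$$

for every $p \in \mathbb{Z}_+$.

For any $\alpha, \beta \in$ T and $p, q \in \mathbb{N}$, set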



I p q 

71 i = (j)-k n + l)\Jl j=(<j-fc„+l)Vl 



{[S i ,S i + fc ™)n[fj\fj+ fc ™)^0}- 



The following lemma is useful for obtaining various estimates in the proof. Throughout the discussions,
for (random) sequences $(x_n)$ and $(y_n)$, $x_n \lesssim y_n$ means that there exists a (non-random) constant $C \in [0, \infty)$
such that $x_n \le C y_n$ for large $n$. We denote by $E_0$ a conditional expectation given $\mathcal{F}^{(0)}$, i.e., $E_0[\cdot] := E[\cdot \mid \mathcal{F}^{(0)}]$.

Lemma A.4. Suppose that [SH3] and [SH6] are satisfied. Let $\alpha, \beta, \alpha', \beta' \in$ T, $\varpi \in [1, 8]$ and $t > 0$. Then

(a) There exists a positive constant C\ such that 



E 



sup V (4, f ); p M(7*), 

)<s<t ~T 
— — p:p<Cr 

SU P E (Tpa,p)q-P N (J P )' 



rv- 








+ E 


sup 

0<a<t 


•uj- 








+ E 


sup 

0<s<t 



E C ct ,p(p,q)M(T v ) ! 



p:p<r 



sup E c c,i3{p,q)N(J p ) l 



< C*i(fc„fnilog6„|) ro/2 , (A.7) 
<C*i(fc n f n |log& n |) ro/2 (A.8) 



for any q, r £ Z+. 
(b) There exists a positive constant C2 such that 



£0 



sup 

0<3<t 



E 



■uj- 








+ E 


sup 






0<s<t 











+ E 


sup 






0<s<i 



p:p<r 



p:p<r 



< C2 ^Vfc„r„| log&„ 
log 6 



for any $p', q, r \in \mathbb{Z}_+$, provided that $\varpi \le 4$.
Proof. (a) Note that

E W>«,/?)£-pM(F) s 

p:p<r 



(g-|-2fc n -l)Ar 

E (w^. 

p=( 9 -2fc„ + l)+Ar 



because $\psi_{\alpha,\beta}$ is equal to 0 outside of the interval $[-2, 2]$.

First suppose that $M \in \{(M^X)^{v_n}, (\mathfrak{M}^{\underline{X}})^{v_n}\}$. Then, Abel's partial summation formula yields



P-P<q p=(q-2k n + l) + /\r 

+ O0a,/j)£_( g+ 2fc n -l)Ar (^% ? +2fc„ - 1) A r ( s ) ~ Mg( q - 2hn+ i_) +Ar -l^ , 

hence by (A. 2) and (A. 4) we have 



sup 

0<s<t 



p:p<r 



< 



V2-4fc„f„|log(4fc n f„)|) < (k n f n I log b n I ) ro/2 . 
Next suppose that $M = (\mathfrak{E}^X)^{v_n}$. Then, the Burkholder-Davis-Gundy inequality yields



En 



sup 

0<s<t 



p:p<r 



<E 



3 E \(^-f\effhs*<t } 



ro/2' 



p:p<r 



Suppose that $\varpi \le 2$. Then, the Lyapunov inequality implies that



En 



sup 

0<s<t 



J2 y> a ,p%-pM(P) 



p:p<r 



p:p<r 



za/2 



hence, by [SH6] and the fact that $\psi_{\alpha,\beta}$ is equal to 0 outside of $[-2, 2]$ we obtain



E{) 



sup 

0<s<t 



J2 (M n q - P M(P) 



p:p<r 



< fc- ro/2 . 



(A.9) 



On the other hand, if $\varpi > 2$, then the Jensen inequality and the fact that $\psi_{\alpha,\beta}$ is equal to 0 outside of $[-2, 2]$
imply that



E{) 



sup 

0<s<t 



J2 (M n q - P M(P), 



p:p<r 



< k~™ /2 E 



{Sf<t} 



p:p<r 



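hence, again by [SH6] and the fact that $\psi_{\alpha,\beta}$ is equal to 0 outside of $[-2, 2]$ we obtain (A.9). Consequently,
we conclude that $E_0\big[\sup_{0 \le s \le t} \big|\sum_{p: p \le r} (\psi_{\alpha,\beta})^n_{q-p} M(I^p)_s\big|^{\varpi}\big] \lesssim (k_n\bar{r}_n|\log b_n|)^{\varpi/2}$. Similarly we can also show
that $E_0\big[\sup_{0 \le s \le t} \big|\sum_{p: p \le r} c_{\alpha,\beta}(p, q) M(I^p)_s\big|^{\varpi}\big] \lesssim (k_n\bar{r}_n|\log b_n|)^{\varpi/2}$, and thus we obtain (A.7). (A.8) can be
shown in a similar manner.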

(b) The claim immediately follows from (a), [SH3], [SH6], (A. 6) and the Schwarz inequality. □ 
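
The step in part (a) that invokes Abel's partial summation formula can be checked on a toy example. The sketch below verifies the identity $\sum_{p=m}^{n} a_p (B_p - B_{p-1}) = a_n B_n - a_m B_{m-1} - \sum_{p=m}^{n-1} (a_{p+1} - a_p) B_p$ with integer inputs, so that both sides can be compared exactly. The function names and the sample sequences are hypothetical; in the proof, as we read it, the weights $a_p$ play the role of the Lipschitz coefficients $(\psi_{\alpha,\beta})^n_{q-p}$ and $B_p$ that of the martingale sampled at the observation times.

```python
def weighted_increment_sum(a, B, m, n):
    """Left-hand side: sum of a[p] * (B[p] - B[p-1]) for p = m, ..., n (requires m >= 1)."""
    return sum(a[p] * (B[p] - B[p - 1]) for p in range(m, n + 1))


def summed_by_parts(a, B, m, n):
    """Right-hand side of Abel's formula: boundary terms minus weighted differences of a."""
    boundary = a[n] * B[n] - a[m] * B[m - 1]
    correction = sum((a[p + 1] - a[p]) * B[p] for p in range(m, n))
    return boundary - correction


# Toy integer sequences, chosen only for an exact comparison.
a = [0, 1, 4, 9, 16]
B = [1, 4, 7, 10, 13]
left = weighted_increment_sum(a, B, 1, 4)
right = summed_by_parts(a, B, 1, 4)
```

The point of the rearrangement is that the differences $a_{p+1} - a_p$ are small when the weights are Lipschitz in $p/k_n$, so the sum is controlled by the supremum of $|B_p|$ rather than by the number of terms.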

The following lemma is a generalization of Lemma 2.3 of Fukasawa [17]. 

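Lemma A.5. Consider a sequence $\mathbf{F}^n = (\mathcal{F}^n_j)_{j \in \mathbb{Z}_+}$ of filtrations and random variables $(\zeta^n_j)_{j \in \mathbb{N}}$ adapted to
the filtration $\mathbf{F}^n$ for each $n$. Let $N^n(\lambda)$ be an $\mathbf{F}^n$-stopping time for each $n \in \mathbb{N}$ and each $\lambda$ in
a set $\Lambda$. Suppose that there exists an element $\lambda_0 \in \Lambda$ such that $N^n(\lambda) \le N^n(\lambda_0)$ a.s. for all $\lambda \in \Lambda$. Let
$\varpi \in (1, 2]$. Then

(a) if $\sum_{j=1}^{N^n(\lambda_0)} E\big[|\zeta^n_j|^{\varpi} \mid \mathcal{F}^n_{j-1}\big] = O_p(1)$ as $n \to \infty$, then $\sup_{\lambda \in \Lambda} \big|\sum_{j=1}^{N^n(\lambda)} \{\zeta^n_j - E[\zeta^n_j \mid \mathcal{F}^n_{j-1}]\}\big| = O_p(1)$ as $n \to \infty$;

(b) if $\sum_{j=1}^{N^n(\lambda_0)} E\big[|\zeta^n_j|^{\varpi} \mid \mathcal{F}^n_{j-1}\big] \to^p 0$ as $n \to \infty$, then $\sup_{\lambda \in \Lambda} \big|\sum_{j=1}^{N^n(\lambda)} \{\zeta^n_j - E[\zeta^n_j \mid \mathcal{F}^n_{j-1}]\}\big| \to^p 0$ as $n \to \infty$.

Proof. Note that

$$\sup_{\lambda \in \Lambda} \Big|\sum_{j=1}^{N^n(\lambda)} \{\zeta^n_j - E[\zeta^n_j \mid \mathcal{F}^n_{j-1}]\}\Big| \le \sup_{1 \le k \le N^n(\lambda_0)} \Big|\sum_{j=1}^{k} \eta^n_j\Big|,$$

where $\eta^n_j = \zeta^n_j - E[\zeta^n_j \mid \mathcal{F}^n_{j-1}]$.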

Let $T$ be a bounded stopping time with respect to the filtration $\mathbf{F}^n$. Then the Burkholder-Davis-Gundy
inequality and the $C_p$ inequality yield







k 






E 








< CE 






3 = 1 







3=1 



for some positive constant C independent of n. Since E 



Ei^+Hc^jr} 
OT






^3 r 3- 


-l 


< E 










TO- 










T 








E 




E< 




< 2CE 















J2j=i \Q 



JQ = l l-SJ I 



E 



by the 



ic?rl^v-i 



by the Holder inequality, we obtain 



Lfc=l 
Therefore, we obtain the desired result due to the Lenglart inequality. □

Now we turn to the main body of the proof. The following lemma is a version of Lemma 12.1 of [32].

Lemma A.6. Suppose that [H1](i)-(ii), [SH3], [H4], [SH5] and [SH6] are satisfied. Let $A \in \{(A^X)^{v_n}, (\mathfrak{A}^{\underline{X}})^{v_n}\}$
and define



for each t £ K+ . Then 

(a) 6^ 1/4 sup 0<t<T |I S | = o p (kl) for every T > 0. 



(b) If A — (A x ) v - and [SH2](ii) holds, we have b n 1 sup < t < T \\l s \ = o p (k%) for every T > 0. 

(c) Suppose that [Hl](iv)-(v) and [SH2](iv) are satisfied. Suppose also that A = (21—)"™. Then we have 
6„ 1/4 sup 0<t<T | I1L; | = Op(k^) for every T > 0. 



Proof, (a) By an argument similar to the proof of Lemma 12.1(a) of [32] we can prove h^ 1 ^ sup 0<t<T 
= o p (k„). Note that for the proof we do not need the strong predictability condition [A2] of [32] and it is 
sufficient to hold that £' > 3/4. 

(b) By an argument similar to the proof of Lemma 4.3 of [32], we can show that 



Therefore, we obtain 



CO 



II, = K ij Mp(J) j _ . A a (X)i = kl J2 c a ,p(p,q)M(J«)-P_ . A u 
i,j=i p,q=i 

hence it is sufficient to show that sup 0<t<r = o p (bi /A ), where fi t = £~ =1 c a , {p, q)M(J*)_lt . A t . 

First, since M(J") S = if s < T'" 1 , we have % = E P E q - q < P +iC a ,p(p,q)M(J q )-T P • A t . Moreover, 
(A.6) yields 



En 



sup 

0<t<T 



E E c a ,e(p,q)M(J«)-rl.A t 



< 



\/r n \logbn\, 



hence we obtain II t — £ p Ylq;q< P -i c a ^ (p, q)M{Ji)_F_ »A t + o p (6„ /4 ) uniformly in t e [0, T]. 
Next we show that 



sup 

0<t<T 



i t -E E {i>^) n q - p M{j")^rl.A t 

P q--q<p-i 



o p (b 



l/4\ 



(A.10) 



Since c a> p(p,q) = (ip a ,p)q- p = if \q -p\ > 2k n , we have 



E n sup V V {c a .p{p,q)-^ a .pY q _ p }M(ji)^_ .At 

Ct<T , , 
P q:q<(p-l)Ak„ 

<k n f n ■ ^k n f n \\ogb n \ < $ i '~tiy/\]a S b n \ = o p (&y 4 ) 
by Lemma A. 4(a), (A. 4) and [SH5]. In addition, we also have 



En 



sup 

0<t<T 



E E {^pM - (tPcM-p) M(Ji)-Tl . A t 

p q:k n <q< P -l 

T 



< Sup \Ca,p(p,q)-{lpa,p)q-p\ E 



p,q>k 



p,r-\p-<i\<kn 



E 



M(J«) t Lf\A' t \dt 



--O p (bl/ 2 ■ k n ■ y/f n \iogb n \) = O p (bl/ 4 ) 
by (A.6), [SH5] and Lemma 3.1 of [32]. Consequently, we conclude that (A.10) holds true. Therefore, by
integration by parts we obtain fi t = £ p Y Jm<P -i{^)q- p M{Ji) t A{P) t + o p {bl/ 4 ) uniformly in t G [0,T]. 
Moreover, since Abel's partial summation formula yields 

E E (i> a ,M- P M(J«)t{A(F>) t -A(rn t } 

P q-q<p-i 

= E E { WcM-p - Wa,|9) J- p -i } M(J«) t {A §PM - A RPM ) , 

P q:q<p—l 



we have 



sup 

0<*<T 



i'-E E (^^)^_ P M(>) t A(rf) t 

P q:q<p-l 



= o P (b l J') 



due to (A. 6) and the Lipschitz continuity of ip a< p. 
Now we show that 



sup 

0<t<T 



E E {M q l . p M{j^) t A' Rv ^\r p {t)\ 

P q-q<p-i 



= 0^). 



Lemma A. 4(a) and the Schwarz inequality yield 



E 



sup 

0<t<T 



\\ogb n \E 



E E (M n q - p M(J^) t {A(Tn t ~A' Rp ^\TP(t)\} 
p q-q<p-i 

f R?(T) 

E/ \K-a' rv -Ms 



( r r R p (T) 1 1 1/2 

^knf^logb^T 1 ' 2 IE Y \A' S - A' RP -,\ 2 ds \ 

{ [pJRP-HT) J J 



Further, (A.4), [SH5] and (A.5) imply that 



E 



r R p (T) 

E/ IK-A'^fds 



< E 



R"~ 1 (T)+2f n 

E/ 



L P 



d.s 



(A.ll) 



(A.12) 



n n 



for some A <E (0, |), hence we obtain 



E 



sup 

0<t<T 



E E w a ,p)^mH{A(?*) t -A , &-i\T*(t)\} 

P q:q<p-l 



<b'r 



-(3-A)-l- 7 ,i 



Since 4- (3 — A) > || by [H4] and 7 < we conclude that (A.12) holds true. On the other hand, since by 
Lemma A. 4(a) and [SH5] we have 



sup 

0<*<T 



Lemma 3.1 implies 



E E Wa lj9 )J- p M(j*MJ ep _ I {|i*(f)|-|r"|i {Jl p-i< t} } 

P q-q<p-i 



< V*nfn|log6„| SUp |r JV ?+ 1 | 
0<t<T 



sup 

0<t<T 



«' _1_3_ 1 



= O p (6 n 2T4 p |log6„|). 



Since p(£' + 1) > 2, we conclude that 

AT»+1 



sup 

0<t<T 



E E (V^-pM^M^pi 

p=l q:q<p—l 



(A.13) 
Furthermore, with taking w = p A 2 and lC t = V J 7 ^ for each t <S R+, we have



E 



sup 

0<t<T 



b~* E 



p=l 



£ ^ a ,p) n q . p M{J q ) t A' Rp ^\ 

q:q<p-l 



\H Rp -l 



<« (A^f„|log6 n |) T (iV? + l) sup G(w)r<6n 2 ^|log6„|^(iV^ + l) sup G(w)™ 

0<i<T 0<t<T 

by Lemma A.4(a) and [SH5]. Since (£' + l)ro/2 > 1, [Hl](ii) and (A. 3) imply that 



sup 6„ 4 ^ E 



0<t<T 



Y {i> a ,^- p M{J<i) t A! Rp ^\T p \ 

q:q<p-l 



\H R p-l 



-+ P 0, 



hence, note that M{J q ) t = Mf q — Mf q _ t if q < p < iV™ + 1, by Lemma A. 5 we obtain 

ATf+1 



sup 

0<t<T 



k J2 E (M n q - P M(HA' R ^G(ir RP - 

p=l q:q<p—l 



Therefore, [Hl](i), Lemma A.4(a), [SH5] and the fact that N% = O p (6~ 1 ) yield 



sup 

0<t<T 



b n E E (M^-pM(.HA' RP ^G RP - 

p=l q:q<p—l 



(A.14) 



Moreover, note that M(J q ) t = if t < T 5-1 , < f k+l and the fact that tp a ^ is equal to out side of 
(-2,2), Lemma A.4(a), [SH2](ii) and [SH5] imply that 



En 



sup 

0<*<T 



& «E E ^a^-pM^^tA'^-xG^-ll^p-lyt} 
p q:q<p-\ 



< 



\/k n r n \ log b n \ sup b n ^ 1 



0<t<T 



{ R p-2k n -l <t<R p~l 



}<bi /4 VK\h^K\, 



hence we obtain 



sup 

0<t<T 



fe «E E {^pT q . p M(J q ) t A' RP ^G R p- 

p q:q<p~l 



Here we show that 



sup 

0<t<T 



6 ™E E (^)q-pM(J q )t(F RP -i -F R(p - <)+ 

p q:q<p-l 



= o P (bl/% 



(A.15) 



(A.16) 



where k' n = 2k n + 2 and F = A'G. Let r k = inf{s e R+|iV s G = k} (k = 1, 2, . . . ) and set T = {r k \k = 
1, . . . , Ntf}. Then, by Lemma A.4, [SH2](ii), [SH5], (A.4) and (A. 5) we have 



E 



sup 

0<t<T 



t»E E (^)q- p M(H (F RP -1 - F Rip . <)+ ) 
p q:q<p-l 



<yhnXn\ log b n \E 



bn E l-^-R"- 1 — ^ 1 R(P- fc ^)+ | 1 



< 



VkJ^gKl {(knfn) 1 ^-^-- 1 + b n E [#!"]} 
for some A g (0, |), where I n = {g€ N|T n [i?^ fc ">+ , i? 9 " 1 ) 7^ 0}. Since for sufficiently large n we have
#1" < 4k n N!f, we obtain 



sup 

0<t<T 



p=l q:q<p—l 



--0 



because 7 < |£' — |, and thus (A. 16) holds. After all, it is sufficient to show that sup 0<t<T |A t | — > p as 
n -> 00, where A t = b\ T,pT, q -. q< p-i(4><*,p)q-pM(J q )tF RP - k > n . 

Let = '^2 p . p>q+ i{'^cx,p)^- p F RP -k' n for each g. Then, by construction H q is 1 -measurable and we 
have At = ^2 q H q M(j q ) t . This implies that the process A t is a locally square-integrable martingale with 
respect to the filtration F and its predictable quadratic variation is given by (A) f = b'n J2 q \H q \ 2 (M){J q )t. 

1 /2 

Since \H q \ < k n , we have (A)t = O p (b n ) = o p (l). Consequently, the Lenglart inequality completes the 
proof of the lemma. 

(c) By using [Hl](iv)-(v) and [SH2](iv) instead of [HI] (i)-(ii) and [SH2](ii) respectively, we can adopt an 
argument similar to the above for the proof. Note that l\ is %^-adapted and J\ is -adapted. This can 
be shown in a similar manner to the proof of Lemma A.1. □

The last lemma is a version of Proposition 4.4 of [32], which deals with the condition (III):

Lemma A.7. Suppose that [H1](i)-(iii), [SH3], [H4], [SH5] and [SH6] are satisfied. Then we have (A.1) as
$n \to \infty$ for every $t \in \mathbb{R}_+$ if

(a) M,M' G {X,£ x }, N, N' G {Y,€ Y } and [SH2] (i) -(iii) hold true, 
or 

(b) [Hl](iv)-(v) and [SH2] hold true. 

Proof, (a) We decompose the target quantity as 

]T (K?I&').(L« fi (M,N),L%,(M',N%- E (^^") • K%^,{M, N; M\ N') t 

i,3,i',j' 

=Ai t + A 2lt + A3* + A 4 , 4 , 



where 



and 



A 2 , ( = £ (^'^ i ')-({^(^i^(^'}-(^a(^S^(2) i '>)i 

- ]T (K^K^') . {{N p {jy,N' p ,{jy')- . <M Q (2)\ M^Z)*')), 

A 3li = £ • ({M Q (X) l _i^,(J%} . (JV /3 (^,M;,(X) i ')) t 

- ]T (^'^').((M a (^Sj^,(^>_.(i^(^)^^(f) i '>) t , 

- £ {iClfCJ ) . ((iV, 8 (J) J ,M^(X) l V • (M.(X) l ,iVs,(J) J ')) t . 
Consider Ai t first. By the use of associativity and linearity of integration, we can rewrite Ai t as



where M"' = M a {l) l M' a ,(iy' - (M a (2)'\ M' al @f). Moreover, by an argument similar to the proof of 
Lemma 4.3 of [32], we can show that 

{K^te'J) . {m5 . (N p (jy,N' p ,(jy')} t = . (N^jy,N' ,(jy')} t . 

Therefore, we obtain 

A 1)t = E K lJ K^'{My\(N p (jy,N' ,(jy')} t 



K E c a , (p, q )c a ,, 0/ (p',q)M p J p ' .{Jl.(N,N')} t 



p,q,p — l 

hence it is sufficient to show that A M := X^g, P '=i c <x,p{p, q)c a ',0'(p', q)M p ' p • {Ji • (N,N')} t = o p {bl/ 2 ). 
Since Mf> p ' = if s < S py P'-\ we have 

a m-E E E c Q ,^, g ) CQ ^p^)^- y ^-^^^ 

q p:p<q-l p':p'<q-l 

by Lemma A. 4(b), (A. 3) and [H4]. Moreover, by an argument similar to the proof of (A. 10), we can show 
that 

a m = E E E (V'a^)^ P (^^^)^ ^) 'M^'.{J5.(7V,iv')} t + p(^>y 2 ). (A.i?) 

q p:p<q-l p':p'<q-l 

Therefore, integration by parts yields 

A m = E E E (^,/ 3 )^ P (^^ /3 0^M^ p '(^^ r, >(^) t + o P (^ /2 )■ 

q p:p<q—lp':p'<q—l 

Now we separately consider the following two cases: 

Case 1: N = N' = (M Y ) Vn . First, by an argument similar to the proof of (A. 11) (using Lemma A. 4(b) 
instead of (A. 6)) we can show that 

A M=E E E (^^^^(^^^Mr'^iVO^^. + o,^/ 2 ). 

q p:p<q-l p':p'<q-l 



Next we show that 

a m = E E E (^^)^(^^o«- P 'Mr'(^^')^_ 1 |r«(i)i+o p (6y2 ) . 

q p:p<q—lp':p'<q—l 

Since (A.4) and [SH3] yield 

E [\(N,N')(T^) t - (N,N'Y Rq ^(t)\] 



(A.18) 



<E 

for any A > 0, we have 



R q - 1 {t)+2f„ 



R«- l (t) 



E 



\{N,N')' u -{N,N')' m -A 



Ri- 1 



du 



<6i 



E 



E E E (V^/ 3 )^ P (V'a^ /3 0^p'M^ p '{(^^(^ 9 ) t -(^ r ,^r fl5 - 1 |^ 9 (^)|} 

q p:p<q-l p':p'<q-l 



\logb n \E 



J2\(N,N')(n t - (N,N'Y Rq ^(t)\\ 



<&F~ f ~ A ~ 7 |log&„ 
by Lemma A.4(a) and (A. 5). Since §£' - 2 > ^ by [H4] and 7 < ^,we can take A € (0, §£' — 2 - 7) in the
above and thus (A. 18) holds true. On the other hand, since 

E ° E E E (^)^ p (^^^0^Mr'(A,7V')^- 1 {|^«(^)|-|^|l {i^ ,- 1 < t} } 

<j p:p<q—lp':p'<q-l 

<fc„f„|iog& n ||r JV "+ 1 | = o p (&«' +1/2 |i°sM) - o p (bl/ 2 ) 

by Lemma A. 4(c) and [Hl](i), we obtain 

JV™+1 

A b*= E E E (^,/ 3 )^ P (V'a^ /3 o^ P 'Mr'(A,7v')' i^ ,_ 1 |L''|+ 0^) (6y 2 ). 

<j=l p:p<q—lp':p'<q—l 

Moreover, by an argument similar to the proof of (A. 14) we can conclude 

g=l p:p<q—lp':p'<q—l 

Further, an argument similar to the proof of (A. 15) yields 
Ai, t -6„£ J2 E (^)™_ p (Vw^- P <^ 

q p:p<q-l p':p'<q-l 

Now we show that 

Ai,t = 6 n E E E (^, /3 )^_ p (V'aV /3 0^<' P > i? ( ? -^) + +Op(^ /2 ), (A.19) 

q p:p<?-lp':p'<5-l 

where fc' t = 2k n + 2 and F = (N, N')'G. By an argument similar to the proof of (A. 16), we can prove 



5 «E E E (^r q . P (^'^~ P ' M r' (f, 

q p:p<q—lp':p'<q—l 

=0 P ((^'^ )(f_Ah7 + fe|'l|log6„|) 



for any A > 0. Since 7 < | (£' — |), we can take A such that (£' — 5) (| — A) — 7 > | in the above. Thus 
we conclude that (A.19) holds true. Now we have 

Ai, t = b n J2 H(iy'M'{P) t + b n H(2fM(P) t + 2b n £ H(3) p Mf' p + o p {b)l 2 ), 

p' p p 



where 



q:q>p' + l 
q:q>p+l 



{i>a,p)q-pM (l P )t 



p:p<p' 



F 



p':p' <p 



^1- k n )-f- ' 



and H(3) p = E g: q>p(^^)q_ p (V'a',/3')Q-p^( g -Kj+ ■ SinCG H(l) p ' is J^/.i -measurable, we have 



E* 

P > 



b^Hiiy' M(p) t 



Sp'-i 



^El^ 1 )' 



Moreover, Lemma A. 4(a), [SH2](i), [SH3] and the fact that ip a ',p' is equal to outside of (—2,2) yield 
^[^(l)* 5 '! 2 ] < fc 2 • fc„f„| log 6„ I , hence we obtain 



E11 



E^ 

p' 



b]l 2 H(iy' m(i p ') 



Sp - 1 



= O p (k n r n \ log&„|) = o p (l) 
by (A.3) and the fact that E (M)(T P ')t\F§ P > is Jr( ° )

-measurable. Therefore, Lemma A. 5 implies that 
bl/ 2 £ p , H{l)P'M{IP') t = o p (l). Similarly wc can show b]j 2 £ p H{2) p M{I p ) t = o p (l) and bT £ p H(3) P M P ' P 

~ 1/2 

= o p (l). Consequently, we conclude that Ai jt = o p (b,i ). 

Case 2: N = N' = (<£ Y ) V ". In this case we have (N,N') t = p-E£Li !{?»<*} duc to [ SH2 ](i), hence we 
have 

A M = -iEE E (^^)^p(^^o^<' p '*| 2 9 i { f a <t}+ o P (^ /2 )- 



q p:p<q p':p' <q 



Therefore, an argument similar to the latter half of Case 1 yields Ai^ = o p {b n 

Consequently, wc conclude that Ai. t = o p (&; 4 ■ bl/ 2 ). Similarly we can also show that A2,t = o p (k^ ■ bl/ 2 ). 
Next we consider A 3jt . By the use of associativity and linearity of integration, we have 

where L l i = M a (I) 1 N'p,(jy — (M a (I) % , N'^^JY ). Therefore, by an argument similar to the above we 
obtain A 3 , t = ki E~ , pW=1 c a , (P, ip', q')L P - q ' • {&Jl • (Af', N)} t . Since P n J" = if |f>' -«| > 1, 

we obtain A 3 , t = A; 4 E P: <j' E P ', 9 :| P '- g |<i c a ,/3(p, q)c a >,p>(p', q')L p _l q • {P Jl • (Af, 2V)} t . Hence, Lemma A.4, 
the Lipschitz continuity of a' and the fact that c a ^(p,q) = if \p — q\ > 2k n yield 



P, 9' P',g:|p'-9|<1 



-En 



<fc,t ■ K ■ \fk^r n \ log 6„ | • fc„ 1 = o p (fc 4 • &y 2 ), 

and thus we obtain fc~ 4 A 3 , t = Y^,q, q >=i c a,p{p, q)c a >,p> (q, q')L p _l q • {Jl • (M',N)} t + o p (bl/ 2 ). Then, an 

argument similar to the above yields /c~ 4 A 3jt = o p (bl/ 2 ). Similarly we can also show that fc~ 4 A.4. t = o p (bn ), 
hence we complete the proof of (a). 

(b) Similar to the proof of (a) (note that I\ is "H-.-adapted and J\ is ^-..-adapted.). □ 

Proof of Theorem 3.1. First, the condition (III) immediately follows from Lemma A.7. Next, Lemma
A.6 and integration by parts imply that the condition (I) is satisfied. Finally, since for any locally square-integrable
martingales $L$, $M$, $N$ and any $\alpha, \beta \in$ T we have

{M a ,p(M,N) n ,L) t 

k'i . {M a (jy_ . WMp&nt + E Ri - • { W)- • WTumt 



(ipHYk n y 



due to Lemma 4.3 of [32], Lemma A. 6 yields the condition (II). Consequently, we obtain the desired result 
by Lemma A. 2. □ 



B Proof of Lemma 5.1 



Exactly as in the previous section, we can use a localization procedure for the proof, which allows us
to assume the conditions [SH3], [SH5]-[SH6], (A.2) and (A.4).

First we prove two lemmas about the point process generated by the refresh times. 

Lemma B.1. Suppose that [H1](i) and [K_p] for some $p \in (1, 2]$ hold true. Let $(H^n)$ be a sequence of stochastic
processes, and suppose that $H^n$ is $\mathbf{H}^n$-adapted for each $n$ and $\sup_{0 \le s \le t} |H^n_s|$ is tight as $n \to \infty$ for any $t > 0$.
Then we have

b n ^2 H R k - x 



sup 

0<s<t 



fe=l 



T I Jl 

E §^ir fc i 

fc=l LT « fc - 1 



o P (bir 1/p ) 



as n — > oo for any t > 0. 
Proof. Since the assumptions yield

2Vf+l 



E E 

k=l 



i H n 



^Rk-l 



^+1 rr,, 

6 » E 7^ 5ziG 0')? = o p( 1 ). 



fc= i ^fl^-i 



by Lemma A.5 we obtain sup 0<s<t b£ '|Efe=r l rfc l/G£*-i ~ 6 » EkS"* ^-^(lj^^/G"^, 



-1,. ^iV"+l rifi\n inn l v-^.,™ + l 



O p (l). Evidently we have sup < s < t 6£ I^Eli ^-^(1)^-.^., - &„E fe =i = M 1 )- 
hence we obtain the desired result. □ 



Then 



Lemma B.2. Suppose that [Hl](i) and [K4/3] hold true. Suppose also that h^bl/ 4 — > as n — > 00. 

su Po<s<t h~ b n {Ng — N" g _ h > ) is tight as n — > 00 /or any i > 0. 

Proof. Since ES , , +2 l^l/G^ < (/i„ + sup 0<u<t |r JV " +1 |) sup 0<u<t G^ 1 for any a € [0,t], the 
desired result follows from the assumptions, Lemma 3.1 and Lemma B.l. □ 

Next we consider the asymptotic properties of the estimators for the noise covariance matrix. 
Lemma B.3. Suppose that [HI], [SH3], [H4], [SH6], (A.2) and (A.4) are satisfied. Then 

E +b^m§^\i k \} 



sup 

0<s<t 



1 



sup 

0<s<t 



sup 

0<s<t 



fc=l 
iV"' 2 + l 



W 22 "F E {*|V 1 +^ 1 [Z]f fc - 1 |i fc l} 



*;=i 

JV™+1 



= o P (by 4 ), 
= ° p (&i /4 ), 



^C 1 ) 12 -^ E {*S- 1 1 { s^fn+ b " 1 ^^^- 1 l /fc * Jfc l} 



o p (b. 



1/4-1 



(B.l) 
(B.2) 
(B.3) 



as $n \to \infty$ for every $t > 0$.

Proof. We can consider each of (B.l) and (B.2) as a special case of (B.3) by taking X = Y and S k = T k , 
hence it is sufficient to prove (B.3). Furthermore, by symmetry it is sufficient to show that 



sup 

0<s<i 



{S<==T*=} 



fe=l 



o P (^ /4 ), 



).-U*(J*) a ) 



oM /4 )- 



where 7«(1) 12 = E^+i^Xg* - X^_0(Y fii+1 - Y^). 
First, by (A.3), (A.4), (A.2), [SH3], [H4] and [SH6], we have 

sup ^(l) 12 - J - V(iX x (J*) a - ll x (^'- 1 ) s )(if y (J fe+1 

0<a<t { V 

Next, integration by parts yields 

= - E {L(!d x ,il Y ) L s ' k+1 - L(ii x ,U Y ) k ' k - L(tt x ,tt Y ) k ~ 1,k + L(U X ,il Y ) k ~ 1,k+1 } 



+ J2[£ x , £ y p fc n J k ) s + rg,§](J* n J fc ) s 
ft fc ™ ft 

=:Ai,„ + A 2 , s + A 3jS . 

Combining martingale properties with (A.3), (A.4), (A.2) and [SH4]-[SH6], we obtain supq< s <j |Ai, a | = 
o p (bl/ 4 ). On the other hand, (A.3), [SH6] and the Doob inequality imply that A2, s = p- Eft i ^ 1 R k ^{s k =f k <s} 
O p (bl/ 2 ) uniformly in s £ [0, t]. Therefore, by arguments similar to the proofs of (A. 12) and (A. 13) we obtain 
A2, s = -p- Eftii +1 ^R k -^{s k =f k } + °p( fo " /4 ) uniformly in s <G [0,i]. By applying a similar argument to A 3jS , 
we complete the proof of the lemma. □ 






Finally we consider the asymptotic property of the estimator $\Xi[f]^n$ for the asymptotic variance due to
the endogenous noise. For this purpose we first analyze the more general quantity $\Xi_{\alpha,\beta}(V,W)^n$. For any
semimartingales $V$, $W$ and any $\alpha, \beta \in T$, we introduce an infeasible version of this quantity:



oo 



i=l 



Moreover, we define the processes $\mathbb{M}^{(1)}_{\alpha,\beta}(V,W)^n$ and $\mathbb{M}^{(2)}_{\alpha,\beta}(V,W)^n$ by

<^ w)? = e (<f> a , r q - p v(Tn- • w(j«) u 

p,q:q—k n <p<q—l 

mS(v, wo? = 2 (^,o)^_ p ^(>)„ . 7(f«) t 

p,q:q—k n <p<q—l 

and set $\mathbb{M}_{\alpha,\beta}(V,W)^n = \mathbb{M}^{(1)}_{\alpha,\beta}(V,W)^n + \mathbb{M}^{(2)}_{\alpha,\beta}(V,W)^n$. Then we have the following results:

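Lemma B.4. Suppose that [H1], [SH3], [H4], [SH5]-[SH6], (A.2) and (A.4) are satisfied. Let
$V \in \{M^X, \mathfrak{E}^X, \mathfrak{M}^{\underline{X}}, A^X, \mathfrak{A}^{\underline{X}}\}$, $W \in \{M^Y, \mathfrak{E}^Y, \mathfrak{M}^{\underline{Y}}, A^Y, \mathfrak{A}^{\underline{Y}}\}$ and $\alpha, \beta \in T$. Then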

as n -> oo. 

Proof. Fix a t > 0. Since 

~ pAq 

s^(v;wo? = — ^ 2 <_^_^(/ p )^(J 9 ) s , 

p,q: \q— p\ <k n i—(p\/q—k n -\-l)\/l 

by integration by parts we can decompose the target quantity as 

1 



pAq 



Pi«:|9-p|<fcn i=(pVg-fe„+l)Vl 



»2,s ~r J»3,s- 



First consider $\mathbb{B}_{1,s}$. Noting that $V(I^p)_s = 0$ if $s \le S^{p-1}$, we have



sup 

0<s<t 



Moreover, since 



£ a;_,^_ i V(/»)_.H'(J«), 



p,q:q-k n <p<q-l i=(q-k n + l)\/l 



Opib'J 4 ). 



fcn-1 



E E • = E E 

p,q:q— k n <p<q— 1 i= (q— A; n +l)Vl p,q:q—k n <p<q—l i—q—p 

q~>k n q>k n 

we obtain Bi, s = E p , g:9 -fc„<p< 9 -i (^,/3)g-p^(^ P )- • W"0^% + 4 ) uniformly in s e [0,t] by using the 

Lipschitz continuity of a, /? and the martingale property of W if W G {M y , l£ y , 9Jt— }. Similarly we can show 
that B 2 , s = E P , q -.q-k n <p< g -i(Mq- p W(J p )- • ftf*). + OpibT) uniformly in a € [0,t\. 

Finally, since $I^p \cap J^q = \emptyset$ if $|q - p| > 1$, by using the Lipschitz continuity of $\alpha$ and $\beta$ and the fact that
$\alpha(x) = \beta(x) = 0$ if $x \notin (0, 1)$, we obtain

p 



3>3,s 



p i = ( p _fe n +i)vl 
-1 V^P 



uniformly in s G [0, i]. Since k n 1 EL(p-fc„+i)vi a p-i^ P -i = <?W( ) + °p( fc ra uniformly in p > k n by the 
Lipschitz continuity of a and /3, we conclude that sup 0<s<t |B 3 s — <^ a ,/3(0)[V, W] s \ = o p {b\ ). Thus, we 
complete the proof. □
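Lemma B.5. Suppose that [H1], [SH3], [H4], [SH5]-[SH6], (A.2) and (A.4) are satisfied. Let
$M \in \{M^X, \mathfrak{E}^X, \mathfrak{M}^{\underline{X}}\}$, $N \in \{M^Y, \mathfrak{E}^Y, \mathfrak{M}^{\underline{Y}}\}$ and $\alpha, \beta \in T$. Then

(a) $\sup_{0 \le t \le T} |\mathbb{M}_{\alpha,\beta}(A^1, N)^n_t| = o_p(b_n^{1/4})$, $\sup_{0 \le t \le T} |\mathbb{M}_{\alpha,\beta}(M, A^2)^n_t| = o_p(b_n^{1/4})$ and
$\sup_{0 \le t \le T} |\mathbb{M}_{\alpha,\beta}(A^1, A^2)^n_t| = o_p(b_n^{1/4})$ as $n \to \infty$ for any $A^1 \in \{A^X, \mathfrak{A}^{\underline{X}}\}$, $A^2 \in \{A^Y, \mathfrak{A}^{\underline{Y}}\}$ and any $T > 0$.

(b) $\sup_{0 \le s \le t} |\mathbb{M}_{\alpha,\beta}(M, N)^n_s| = O_p(b_n^{1/4})$ as $n \to \infty$ for any $t > 0$.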

Proof. (a) First, $\mathbb{M}^{(1)}_{\alpha,\beta}(A^1, N)^n$ is obviously a locally square-integrable martingale and we can easily prove
$\langle \mathbb{M}^{(1)}_{\alpha,\beta}(A^1, N)^n \rangle_T = o_p(b_n^{1/2})$. Thus we have $\sup_{0 \le t \le T} |\mathbb{M}^{(1)}_{\alpha,\beta}(A^1, N)^n_t| = o_p(b_n^{1/4})$ by the Lenglart
inequality. On the other hand, since the quantity $\mathbb{M}^{(2)}_{\alpha,\beta}(A^1, N)^n$ has asymptotically a structure similar to that
of the process II defined in Section A (see Eq. (A.10)), we can adapt an argument similar to that of the
proof of Lemma A.6. Hence we obtain $\sup_{0 \le t \le T} |\mathbb{M}^{(2)}_{\alpha,\beta}(A^1, N)^n_t| = o_p(b_n^{1/4})$, and thus we conclude that
$\sup_{0 \le t \le T} |\mathbb{M}_{\alpha,\beta}(A^1, N)^n_t| = o_p(b_n^{1/4})$. Similarly we can show the other claims.
(b) Since 



e>/,AT) t = E E E 

q p:q~k n <p<q—l p':q — k n <p' <q—l 



K.fi ( V 5 J ( hr~ ) M(P)-M(P )_ . (N)(J*) t , 



(M^*fl(M, N) n ) t has asymptotically a structure similar to that of Ai ;t defined in Section A (see Eq. (A. 17)). 
Consequently, we can adapt an argument similar to that of the proof of Lemma A. 7, hence we obtain 

2 



(M%(M,N) n ) t 



E ^ ( hr 1 ) (M){P)t{N){J% + o p {b]/ 2 ). 

a,q:q—k n <p<q—l 



Therefore, by an argument similar to the proof of Lemma 4.6 of [32] we can show that $b_n^{-1/2} \langle \mathbb{M}^{(1)}_{\alpha,\beta}(M, N)^n \rangle_t$
converges to a random variable in probability. In particular, $\langle \mathbb{M}^{(1)}_{\alpha,\beta}(M, N)^n \rangle_t = O_p(b_n^{1/2})$. Similarly we can
prove $\langle \mathbb{M}^{(2)}_{\alpha,\beta}(M, N)^n \rangle_t = O_p(b_n^{1/2})$, and thus the Lenglart inequality yields the desired result.
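
For reference, the way the Lenglart inequality turns a bound on the predictable quadratic variation into a uniform bound on the martingale can be sketched as follows. Here $M^n$ is a generic locally square-integrable martingale with $M^n_0 = 0$, and the constants $K$, $L$ are introduced only for this illustration; they do not appear in the original argument.

```latex
% Lenglart inequality for a locally square-integrable martingale M^n, M^n_0 = 0:
P\Big( \sup_{0 \le t \le T} |M^n_t| \ge \varepsilon \Big)
  \le \frac{\eta}{\varepsilon^2} + P\big( \langle M^n \rangle_T \ge \eta \big),
  \qquad \varepsilon, \eta > 0 .

% Take \varepsilon = K b_n^{1/4} and \eta = L b_n^{1/2}:
P\Big( \sup_{0 \le t \le T} |M^n_t| \ge K b_n^{1/4} \Big)
  \le \frac{L}{K^2} + P\big( \langle M^n \rangle_T \ge L b_n^{1/2} \big) .
```

If $\langle M^n \rangle_T = O_p(b_n^{1/2})$, the last probability can be made small uniformly in $n$ by taking $L$ large, and then $L/K^2$ is made small by taking $K$ large, which gives $\sup_{0 \le t \le T} |M^n_t| = O_p(b_n^{1/4})$. If instead $\langle M^n \rangle_T = o_p(b_n^{1/2})$, the last probability vanishes for every fixed $L > 0$, and letting $L \downarrow 0$ gives the $o_p(b_n^{1/4})$ version.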

Now we can prove a lemma about the estimator $\Xi[f]^n$.
Lemma B.6. Suppose that [H1], [SH3], [H4], [SH5]-[SH6], (A.2) and (A.4) are satisfied. Then



□ 



sup 

0<s<t 



k=l 



k=l 



O p (b 



1/4n 



as $n \to \infty$ for any $t > 0$.

Proof. Note that (0) = </>/',/» (0) = and (0) = —</>/',/' (0) = — ll/lll by integration by parts. 
Since we can easily prove Z fJ , (X, Y)» = E fJ , (A, F)« + 5,^ (il x , F)™ + H/,/" (A, il y )« + H /v « (il x , iF)? + 
Op(6„ 1 ^ 4 ) uniformly in s G [0,i], Lemma B.4 and B.5 imply that 



sup 

0<s<t 



&y 2 s /i/ ,(x ) Y): - n/ii 2 E - [^a(j p ) s } 



= o P (&y 4 ). 



Then, by arguments similar to the proofs of (A. 12) and (A. 13) we obtain 



sup 

0<s<t 



b^~ fJ ,{x,vY s -\\f\\l{ E \x,Y]s*-Ai k \- E [x,y\f.-AJ k 



k=l 



k=l 



O p (# 4 ). (B.4) 



In a similar manner we can show the equation obtained by replacing bl/ 2, Efji(X,Y)™ with —bl/ 2 ^fj(X 1 Y)" 
in (B.4). Consequently, we obtain the desired result. □ 

Proof of Proposition 5.1. (a) The claim immediately follows from Theorem 3.1. 
(b) First, since



JV"' 



+1 



E - E 



fe=i 



k=l 



{S k - 1 >(s-h„) + 



k=l 



} {m_, + b- l \x]'s k -Ai k \}, 



Lemma A.5, B.l and B.3 yield dj™(l) n = E fc = 
|iV"— iV™' 1 ! < 1, again using Lemma B.l, we obtain c*7™(l) 1:L 



-j-11 

L {S k - 1 >(s-h n ) + }^'s k - 1 



o p (l). Then, noting that 



o p (l). Now, [HI] and the dominated convergence theorem imply that 07™(l) u = £p h J^ 8 _ h j ^ /£r"du+ 



o p (l). Since fe" 1 /^ -> <?~ 2 and /i" 1 J ( S s _ h ^ V^/G^du -> *"/G"_ a.s., we conclude that d^(l) 
e- 2 ^ IG n s _ as 



11 



(s-hnh 

n — > 00. The tightness of sup 0<s<t |<97"(1) | follows from Lemma B.2 and B.3. Similarly 
we can also show the corresponding claims for the other quantities.

(c) An argument similar to the proof of (b), using Lemma B.6 instead of Lemma B.3, implies the
desired result. □



C Proof of Theorem 5.2 

Lemma C.1. Suppose $S^i = T^i$ for every $i$. Suppose also that (A.3), [H3]-[H6], (5.2) and (5.3) are satisfied.
Then b~ 1/A {PHY(X, Y)" - PHY(X', Y')"} ^ as n oo, where X' si = X S i + Ajef, + p,lb~ 1/2 X(P) and 
r si = Y si + \le^ + ixlb- 1/2 Y{P). 

Proof. By a localization procedure, we can replace [H3] and [H5]-[H6] with [SH3] and [SH5]-[SH6] respectively.
Furthermore, a localization argument similar to that in the first part of Section 6 of [32] allows us to
assume (A.4) and that there exists a positive number $C$ such that



$N^n_t \le C b_n^{-1}$ for any $t \in \mathbb{R}_+$ and any $n \in \mathbb{N}$. (C.1)



Set A* = YlT=uK> f° r eacn m e Z + . We define the random variable ef by if = J2lt=o ^i+i e si-« f° r 
every i. Then we have 



hence we obtain X)«=o ^u£gi- 



u—l u—Q 
C X (?X ?x 



\o6gi — (ef — ej_i). Combining this formula with Abel's partial summation 



formula, we obtain 
fc„-i 



E A (.9); 1 E ^ 

p=0 \m=0 
fcn-1 



X 

gi+p-v 



fen-1 



fe„-l fcn-1 

p=0 p=0 



k„-l 



=\l Hg) n P 4^ + E a2 (5);(Cp - Ci) = *S £ A( 5 ); e f 1+P + £ a 2 ( 5 ) p "? 

p=0 p=0 p=0 p=0 

for every i, where A 2 ( 5 )« = A(g)% +1 - A(g); (note that J^" 1 A 2 ^ = 0). Since 



~x 

i+p 



E I s [ e ip e i+p+i] I - E E 



< 



E i^+i 1 



vu=0 



and J2u=i < E^Li ES1 U l A il = E£Li «Ml < °°, w ^ have 

21 



E 



p=0 



i+p 



fe„-l 



fe n — 1 00 



= E A 2 ( 5 ) p "A 2 ( ff )» £: [ef +p ef +p ,] < E E 1^° [CpCp+J I S * 



p,p'— 



p=0 Z=0 
uniformly in i. Similarly, we can show that



E 



k„-l 



' i+p 



kn-1 



p=0 



p=0 



uniformly in i, where X_{I h ) = X_ S k — X_s k -i f° r eacn Therefore, we have i?[|X ff (Z 4 ) — X' g (I l )\ 2 ] < 
fc~ 3 6|~ 1 uniformly in i. Hence the Schwarz inequality, (A. 4), [SH3], [H4], [SH5] and (C.l) imply that 
£[sup < s < t \PHY(X,Y)^-PHY(X',Y)^\] < bt' 112 = o(6 I 1 / 4 ). Similarly we can show that &~ 1/4 {PffP(X', Y) 1 
-PHY(X',Y) n } ^ 0, and thus we complete the proof. □ 
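
For reference, Abel's partial summation formula invoked in the proof above reads as follows for generic real sequences $(a_p)$ and $(c_p)$ and any integer $k \ge 1$. The sequence names are placeholders chosen for this sketch, and the specific choice of $(a_p)$ and $(c_p)$ made in the proof is not restated here.

```latex
% Abel's partial summation (summation by parts), k >= 1:
\sum_{p=0}^{k-1} a_p \,(c_{p+1} - c_p)
  = a_{k-1}\, c_k - a_0\, c_0 - \sum_{p=1}^{k-1} (a_p - a_{p-1})\, c_p .

% Check: shift the index in the first half of the left-hand side,
%   \sum_{p=0}^{k-1} a_p c_{p+1} = \sum_{p=1}^{k} a_{p-1} c_p ,
% and subtract \sum_{p=0}^{k-1} a_p c_p term by term.
```

The formula moves the difference operator from $(c_p)$ onto the weights $(a_p)$, which is why differences of the weight sequence appear after it is applied.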



Proof of Theorem 5.2. By applying Theorem 3.1 to $PHY(X', Y')^n$ defined above, we obtain the desired
result. □



D Proof of Theorem 5.3 

First, we can easily show that &~ 1/4 (H 9ii7 (X, Y) n - {E g ^(X,Y) n + Z g , , g (U x ,Y) n + E gtg ,(X,il Y ) n + 
'Bg' t g'(ii x — ^> 0asn-> oo. Therefore, noting that integration by parts yields <f> gig > (0) = <f> g ', g (0) = 0, 

we have b n 1 ^ 4 {MRC(X, Y) n — M n } as n — > oo from Lemma B.4, Lemma B.5(a) and the proof of Lemma 
B.3, where M™ = M S , 9 (M X , M Y ) n + M s /, ff (ii x , M Y ) n + M g>g > {M x ,tt Y ) n + M g > ig > Since M" has 

a structure similar to that of M" defined in Appendix A (see also the proof of Lemma B.5), we can adopt 
an argument similar to the proof of Theorem 3.1 and conclude that the claim holds true. □ 
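
For reference, the integration by parts identity for the weight function quoted in this proof can be checked in one line. The sketch below rests on two assumptions that are not restated in this appendix: that $\phi_{\alpha,\beta}(0) = \int_0^1 \alpha(x)\beta(x)\,dx$, and that $g$ is continuous and piecewise continuously differentiable on $[0,1]$ with $g(0) = g(1) = 0$.

```latex
% Under the assumptions stated in the lead-in:
\phi_{g,g'}(0) = \phi_{g',g}(0)
  = \int_0^1 g(x)\, g'(x)\,dx
  = \tfrac{1}{2}\big[\, g(x)^2 \,\big]_{x=0}^{x=1}
  = \tfrac{1}{2}\big( g(1)^2 - g(0)^2 \big)
  = 0 .
```

Under these assumptions the cross terms carrying the factor $\phi_{g,g'}(0)$ or $\phi_{g',g}(0)$ drop out.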